Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
Demand Estimation from Censored Observationswith Inventory Record Inaccuracy
Adam J. MersereauKenan-Flagler Business School, University of North Carolina
A retailer cannot sell more than it has in stock; therefore its sales observations are a censored representation
of the underlying demand process. When a retailer or other firm forecasts demand based on past sales
observations, the firm requires an estimation approach that accounts for this censoring. Several authors have
analyzed demand learning in censored environments, but they assume that inventory levels are known and
hence that stockouts are observed. However, for a variety of reasons, firms often do not know how many units
are available to meet demand. In this paper, we investigate the impact of this unknown on demand estimation
and subsequent stocking levels in a censored environment. When the firm does not account for inventory
uncertainty when estimating demand, we discover a systematic downward bias in demand estimates. We
propose and test a heuristic prescription that relies on a single error statistic and sharply reduces this bias.
Our work identifies and measures a new component of the value of tracking technologies such as RFID.
This version : April 17, 2012.
1. Introduction
There has been a surge of interest in recent years in the problem of “inventory record inaccuracy.”
For a variety of reasons, firms’ electronic inventory records do not necessarily match the quantity of
customer-available stock. This uncertainty brings several costs (DeHoratius et al. 2008). Firms must
deploy resources to count stock and reconcile inventory records. Uncertain inventory records hinder
the firm’s ability to match supply with demand, leading to higher inventory and stockout costs.
The current paper exposes another source of cost due to record inaccuracy, namely its confounding
of the firm’s demand estimation. In this way, our findings are relevant for the evaluation of tracking
technologies such as Radio Frequency Identification (RFID).
While the majority of stochastic inventory models in the academic literature assume that the
probability distribution governing demand is fixed and known, the reality is that firms’ forecasts
of demand are perpetually evolving as data is accumulated. In many settings (e.g., retail) the firm
observes sales data that are censored representations of demand. That is, sales in a period are the
minimum of demand in the period and the amount of stock available. Sales are always less than or
equal to demand, and Conrad (1976) demonstrates that a firm who assumes that sales and demand
1
2
are equivalent will stock its shelves suboptimally. To correctly learn demand from sales data, the
firm requires an estimation approach that properly accounts for the censoring. But it is difficult for
a firm to properly account for censoring when erroneous inventory records prevent the firm from
accurately assessing when censoring occurs.
In this paper, we investigate the impact of inventory record inaccuracy on demand estimation and
subsequent stocking levels in a censored environment. The firm chooses a stocking level each period
in the face of an unknown demand distribution. We assume that the firm is sophisticated with
respect to demand estimation, and we borrow the parametric Bayesian demand learning models of
Lariviere and Porteus (1999) and Ding et al. (2002) to account for censoring. However, most of our
analysis assumes that the firm does not account for record inaccuracy when estimating demand.
This appears to be common practice. While we are aware of several retailers who use methods to
correct for censoring, it seems uncommon for retailers to take record inaccuracy into account when
estimating demand. We examine the estimation biases resulting from such a mischaracterization,
and we characterize fixed points of this misspecified estimation process. We discover a systematic
downward bias in demand estimates, even while service levels may appear to the firm to meet
or exceed targets. An implication is that ignoring record inaccuracy can negatively impact the
profitability of a firm even as the firm may fail to detect the problem.
An effective RFID implementation can potentially mitigate the problem by providing the seller
better visibility into censoring, but at substantial expense. In the absence of an RFID implemen-
tation, we propose a heuristic prescription to the problem that relies on a single inventory error
statistic (namely, the probability of a positive, or stock-reducing, error) and substantially reduces
the estimation bias.
Our work complements existing research by arguing for a better understanding of record inac-
curacy and more efforts towards its mitigation. We point out that much of the academic research
on record inaccuracy and RFID has focused on their implications on replenishment policies. Our
work is relevant to a broader set of firms—for example, fashion retailers who engage in limited
replenishment but for whom learning about demand trends is important.
The paper is organized as follows. We review related literature in Section 2. Section 3 introduces
our notation and model setting and reviews some background on demand learning from censored
observations. We present in Section 4 a basic result suggesting a downward bias in demand esti-
mates under record inaccuracy under fairly general assumptions. In Section 5, we provide a detailed
analysis for the case of exponential demand, where we make use of concise demand updating equa-
tions under censored demand established by Braden and Freimer (1991). Specifically, Section 5.1
3
characterizes fixed points of the firm’s estimation and stocking process given this misspecification.
Numerical results in Section 5.2 lead naturally to a discussion of prescriptions in Section 6. Section
7 presents a numerical framework (including the Metropolis algorithm to perform the Bayes update
of demand beliefs) for studying the estimation bias under discrete demand distributions. Using this
framework, we corroborate many of our earlier findings for the case of Poisson demand. We discuss
future research opportunities in Section 8.
2. Related Literature
Our research lies at the intersection of two significant but distinct research streams. The first
concerns estimating demand and optimizing inventory policies based on censored observations.
Wecker (1978), Nahmias (1994), and Agrawal and Smith (1996) present methods for estimating
demand parameters from sales observations when stocking levels are known. Especially relevant
to our work are parametric Bayesian models for learning demand from censored observations,
which date to Harpaz et al. (1982). Lariviere and Porteus (1999), making use of the “newsvendor
distribution” framework of Braden and Freimer (1991) and assuming that inventory is perishable,
show that an optimal forward-looking seller in a censored-demand setting will stock more than a
myopic seller. Ding et al. (2002) extend this result to a more general set of continuous distributions
(see also notes by Lu et al. 2005, 2008 and Bensoussan et al. 2009b). Our work makes use of the
framework assumed in these papers in that we assume a repeated newsvendor setting with demand
censoring and a parametric Bayesian model for learning demand. We review this model in Section
3.1 and the newsvendor distribution framework in Section 3.2. For the special case of exponential
demand that we assume in Section 5, Bisi et al. (2011) and Bensoussan et al. (2009a) offer detailed
results on policies and estimation consistency.
The results of Lariviere and Porteus (1999) and Ding et al. (2002) have been extended by
several authors. Chen and Plambeck (2008), Lu et al. (2007), Chen (2010), and Bisi et al. (2011)
consider perishable inventory. A number of works, including Burnetas and Smith (2000), Huh
and Rusmevichientong (2009), and Huh et al. (2011), consider learning of censored demand in
non-parametric settings. Besbes and Muharremoglu (2012) establish the theoretical importance of
censoring information through an asymptotic analysis, particularly when demand takes discrete
values. Nevertheless, in all the works we are aware of on learning from censored demand, a key
implicit assumption is that inventory positions are known so that stockuts are accurately observed.
The second research stream most relevant to our work is the literature on inventory record
4
inaccuracy and inventory misplacement.1 Work on inventory management with record inaccuracy
dates to Iglehart and Morey (1972). Recent interest in the problem follows empirical studies of
record inaccuracy and inventory misplacement (e.g., DeHoratius and Raman 2008, Ton and Raman
2006) and accompanies the emergence of new tracking technologies such as RFID (Lee and Ozer
2007). Several analytical studies have looked at the implications of inventory inaccuracies in supply
chain settings (e.g., Heese 2007, Gaukler et al. 2007, Camdereli and Swaminathan 2010, Kok and
Shang 2010), but much of the work in this area concerns inventory and audit policies for single-item,
single-location systems with inventory inaccuracy (Kang and Gershwin 2005, Kok and Shang 2007,
Bensoussan et al. 2007, DeHoratius et al. 2008, Rekik et al. 2008, Atali et al. 2011, Mersereau 2011).
These papers differ from each other largely in the details underlying their inventory, error, and
information models. Many of these papers estimate the value of inventory visibility by comparing a
naive or ignorant seller (who is oblivious to inventory inaccuracies), an informed seller (who knows
the distribution but not the realizations of inventory inaccuracies) and an RFID-enabled seller (who
has full visibility into customer-available stock). The papers then measure the value of visibility as
the improvement in decision-making achieved through better information. For example, the value
of RFID is typically claimed to be the difference in performance between the RFID-enabled seller
and the naive or informed sellers.
These papers differ from our work most significantly in that they all assume that demand prob-
ability distributions (and typically also error distributions) are known. In our model, the demand
distribution is unknown and learned from (censored) observations. We model a naive seller who is
oblivious to the error process. We compare this naive seller with a “tracking” (or RFID-enabled)
seller who observes error realizations, and we propose a heuristic for an informed seller who is
aware of errors and has limited knowledge of their distribution. Our focus is the impact of inventory
record inaccuracy on the seller’s ability to estimate demand, in contrast to previous works that all
measure the impact of inventory record inaccuracy on matching supply with demand given known
demand processes. DeHoratius and Ton (2009) survey the literature on inventory record inaccu-
racy and identify an “opportunity to incorporate execution problems into models that estimate
demand and assess forecast accuracy in the presence of stockouts” (p. 67). We believe we are the
first to investigate this interaction between inventory inaccuracies and demand estimation. Our
1 DeHoratius and Ton (2009) distinguish between “inventory record inaccuracy” (where the quantity of inventory ina firm’s facility does not match the firm’s inventory record) and “misplaced products” (where some fraction of theon-hand inventory is in the facility but unavailable to customers). Our interest is in any situation in which a firm’sinventory records do not reflect customer-available inventory. We model an error process that may be a combinationof both types of errors, and we refer to it as “inventory record inaccuracy” for simplicity.
5
work shows that previous assessments of RFID in the literature may underestimate its value by
assuming known demand distributions.
Much of our paper analyzes a naive seller who estimates demand and makes stocking decisions
while oblivious to the existence of inventory inaccuracies. Thus the naive seller makes an incorrect
assumption about the process generating her observations. A small literature has emerged on
misspecified demand models in operations management (Cachon and Kok 2007, Cooper et al. 2006,
2009). Lee et al. (2009) investigates convergence in a general repeated newsvendor setting in which
the seller stocks according to a critical fractile-based policy and demand subsequently depends on
the stocked quantity. This bears some resemblance to our problem, although the misspecification
in their model involves an endogenous demand-generating process rather than uncertainty around
stocking levels.
3. Preliminaries
In this section we build our model by first reviewing the classical Bayesian censored newsvendor
model, including the “newsvendor distribution” framework, to which we then add a model of
inventory record errors.
3.1. Review of Demand Estimation under Censoring
Here we review the censored newsvendor problem and introduce some notation without the com-
plication of inventory record errors. We will add errors to the model in Section 3.3. We assume
that a seller stocks a quantity jt of units in period t to satisfy random demand. We assume that
the demand distribution is unknown. In particular, we assume that demand dt in period t is drawn
from a stationary probability distribution F (·|θ). The functional form F (·|θ) is known, and θ is a
fixed but unknown parameter. Conditional on θ, demand is independent across periods.
At the end of period t, the seller observes sales st = min{dt, jt}, and unfulfilled demand is
unobserved. Therefore, if the seller observes st < jt then she assumes that dt = st; if the seller
observes st = jt then she assumes that dt ≥ st.
The seller begins with a prior density π0 on θ that she updates in a Bayesian fashion each period
based on sales observations. For the purpose of computing a posterior distribution of θ at period
t, the seller requires not only the history of sales St = (s1, . . . , st) but also a vector Ct = (c1, . . . , ct)
of indicators indicating whether or not there was a stockout in each time period. We let ct = 1 if
st = jt and ct = 0 if st < jt.
6
Given this history, we can compute the posterior density πt(·|St,Ct) of θ at time t. Assum-
ing that demand takes discrete, integer values, let f(·|θ) indicate the probability mass function
corresponding with distribution F (·|θ). The posterior density πt(·|St,Ct) is then:
πt(θ|St,Ct) ∝ π0(θ)t∏
τ=1
[1l{cτ = 1}(1−F (sτ − 1|θ)) + 1l{cτ = 0}f(sτ |θ)]
∝ π0(θ)t∏
τ=1
(1−F (sτ − 1|θ))cτ f(sτ |θ)1−cτ . (1)
If demand instead follows a continuous distribution with density f(·|θ), the posterior density is:
πt(θ|St,Ct) ∝ π0(θ)t∏
τ=1
[1l{cτ = 1}(1−F (sτ |θ)) + 1l{cτ = 0}f(sτ |θ)]
∝ π0(θ)t∏
τ=1
(1−F (sτ |θ))cτ f(sτ |θ)1−cτ . (2)
We denote the predictive distribution (after unconditioning on θ) after period t by Ft(d|St,Ct).
3.2. Review of Newsvendor Distributions
An interesting and tractable set of special cases arises when we assume the demand distribution
falls within the family of “newsvendor distributions” identified by Braden and Freimer (1991)
and further studied by Lariviere and Porteus (1999). Here we briefly review relevant results on
newsvendor distributions.
A newsvendor distribution has the form
F (d|θ) = 1− exp{−θg(d)}, d∈R+
where the function g() is positive, differentiable, and increasing for d≥ 0; also, limd→0+ g(d) = 0 and
limd→∞ g(d) =∞. An example is the exponential distribution, which is a newsvendor distribution
with g(d) = d.
The gamma distribution is a conjugate prior for all newsvendor distributions. If we let a0 and
b0 indicate the shape and scale parameters, respectively, of a gamma prior density on θ,
π0(θ|a0, b0) =ba00 θ
a0−1e−b0θ
Γ(a0),
then the posterior distribution of θ given observations St and Ct is gamma with scale and shape
parameters
at = a0 +t∑
τ=1
(1− cτ ) (3)
7
bt = b0 +t∑
τ=1
g(sτ ). (4)
This gamma posterior has mean at/bt and variance at/b2t , and the seller’s predictive density of
demand in period t+ 1 is
f(dt+1|at, bt) =atb
att g′(dt+1)
[bt + g(dt+1)]at+1. (5)
The parameters at and bt are sufficient statistics for the demand parameter θ in the Bayes
sense. This is convenient for computation. Moreover, this implies that, in the case in which a
newsvendor demand distribution is assumed, the matching of ct with its corresponding st becomes
unimportant. That is, the seller need only keep track of the total number of censored observations
and an aggregate measure of sales without regard to which sales observations were censored and
which were not.
Braden and Freimer (1991) argue that the class of newsvendor distributions is a versatile subset
of continuous distributions. However, there appear not to be analogous results for discrete distri-
butions commonly used to model demand, like the Poisson distribution. Section 7 considers the
case of Poisson demand.
3.3. Model of Inventory Record Inaccuracy
We model inventory record inaccuracy by distinguishing between the quantity jt the seller orders
and the quantity it that is in fact customer-available. A discrepancy between it and jt may arise from
a variety of sources: merchandise theft or damage, recording or counting errors, or misplacement.
We assume that it = {jt− εt}+, where εt is a random variable drawn from an error distribution H.
While in some practical contexts errors may accumulate over time until the firm performs an
audit, resulting in time-dependent errors, we assume for simplicity that the error process is sta-
tionary and that errors are independent across periods. This is a reasonable model for a repeated
newsvendor of perishable goods whose stock is impacted by errors each period before customers
arrive. For expositional simplicity, we also assume that H is non-degenerate such that Pr{εt = 0}<
1. Except where specified otherwise, we allow H to be continuous, discrete, or a mixture of continu-
ous and discrete distributions. We also define the distribution Kjt of “realized” errors rt = jt− it =
min{εt, jt}. It follows that Kjt(r) =H(r) for r < jt, and Kjt(jt) = 1 (because rt = min{εt, jt} ≤ jt).
We have Kjt =H when H is such that Pr{εt > jt}= 0.
For the sake of comparison, we consider two sellers operating with different information sets:
1. A “naive” seller knows jt and thinks that jt units are available to the customer in period t.
That is, the seller ignores errors or is oblivious to them. We believe that this is representative of
8
Case Realization st crt cnt(A) it ≥ jt >dt dt 0 0(B) jt > it >dt dt 0 0(C) it >dt ≥ jt dt 0 1(D) dt ≥ it ≥ jt it 1 1(E) jt >dt ≥ it it 1 0(F) dt ≥ jt > it it 1 0
Table 1 Sales and stockout observations recorded in period t by tracking and naive sellers given order quantity
jt under all possible demand and error realizations.
common practice. We assume that the naive seller maintains a database of sales Snt = (sn1 , . . . , snt )
and of stockout indicators Cnt = (cn1 , . . . , c
nt ), where cnt = 1 if snt ≥ jt and cnt = 0 if snt < jt, and that
she computes posterior densities πnt (·|Snt ,Cnt ) for θ using equations (1) or (2).
We permit εt < 0, in which case it is possible for the naive seller to observe sales exceeding her
order quantity—that is, snt > jt. This is a zero-probability event given her belief that jt is the true
customer-available inventory level. We assume that the seller interprets such an observation as a
stockout with sales st.
2. A “tracking” seller knows the true customer-available inventory level it, which is plausible
under a perfect implementation of a tracking technology such as RFID. We assume that the tracking
seller maintains a database of sales Srt = (sr1, . . . , srt ) and of stockout indicators Cr
t = (cr1, . . . , crt ),
where crt = 1 if srt = it and crt = 0 if srt < it, and that she computes posterior densities πrt (·|Srt ,Crt )
for θ using equation (1) or (2).
We will explore the impact of inaccurate inventory records by comparing the performances of
the naive and tracking sellers. Table 1 shows the observations recorded by the two sellers, broken
down into six cases. In cases (A), (B), and (D), the naive seller registers the same information as
the tracking seller. Case (C) represents the situation in which the naive seller perceives a stockout
even though there is no physical stockout. We term this a “false stockout.” Cases (E) and (F)
represent situations in which there is a physical stockout that goes unseen by the naive seller. We
term this situation a “missed stockout.”
4. Comparison when Stocking Quantities are Common
Our first result is that, if the naive and tracking sellers stock the same quantity in a period,
the naive seller perceives too few stockouts in the period compared with the tracking seller. This
initial result requires no assumptions on the demand distribution and holds under a plausible set
of conditions on the error distribution H. We subsequently will establish the implications of this
result in the more restrictive setting in which demand follows a newsvendor distribution.
9
Proposition 1. Suppose the error distribution H is such that Pr{ε > 0} ≥Pr{ε < 0} and suppose
that the naive and tracking sellers stock the same quantity j1 in period 1. Then Pr{cn1 = 1}−Pr{cr1 =
1} ≤ 0.
Proof. Observe from Table 1 that
Pr{cn1 = 1}−Pr{cr1 = 1} = Pr{i1 >d1 ≥ j1}+ Pr{d1 ≥ i1 ≥ j1}−Pr{d1 ≥ i1}
= Pr{(d1 ≥ j1)∩ (i1 ≥ j1)}−Pr{d1 ≥ i1}
= Pr{d1 ≥ j1}Pr{ε1 ≤ 0}−Pr{d1 ≥ i1}
≤ Pr{d1 ≥ j1}Pr{ε1 ≥ 0}−Pr{d1 ≥ i1}
= Pr{d1 ≥ j1 ≥ i1}−Pr{d1 ≥ i1}
≤ Pr{d1 ≥ i1}−Pr{d1 ≥ i1}
= 0,
where the third equality relies on the independence of d1 and i1 given j1, and the first inequality
makes use of the assumption Pr{ε > 0} ≥Pr{ε < 0}
The condition Pr{ε > 0} ≥Pr{ε < 0} of Proposition 1 is consistent with the audit results observed
by DeHoratius and Raman (2008), who found positive discrepancies in 59% of inaccurate records
identified in a retail chain’s inventory audit. Furthermore, this condition covers the error assump-
tions made in much of the analytical literature on inventory record inaccuracy. It subsumes non-
negative errors (i.e., Pr{ε≥ 0}= 1), which are reasonable to assume for certain error sources—for
example, theft, damage, or misplacement. This assumption is made in Kang and Gershwin (2005)
and Kok and Shang (2010). Proposition 1 also applies to error distributions that are symmetric
about zero, which is a plausible model of clerical errors and assumed by Kok and Shang (2007).
Finally, Proposition 1 also applies to mixtures of nonnegative and symmetric error distributions,
which is an appropriate model when multiple sources of errors are assumed (e.g., Atali et al. 2011).
If the naive seller is seeing the same sales observations St and too few stockouts compared
with the tracking seller, it is intuitive that the naive seller would therefore underestimate demand
relative to the tracking seller. This is challenging to show in general because the sellers’ beliefs
generally involve the joint distributions of (st, crt ) and (st, c
nt ), which can be complex. However, if
we work with newsvendor distributions, the sellers’ beliefs depend only on the aggregate statistics∑t
τ=1 sτ ,∑t
τ=1(1− crτ ), and∑t
τ=1(1− cnτ ). This enables the following result.
Proposition 2. Suppose that
a) the error distribution H is such that Pr{ε > 0} ≥Pr{ε < 0} (and Pr{ε= 0}< 1),
10
b) demand follows a newsvendor distribution with known g() and unknown parameter θ > 0, on
which the sellers share an arbitrary, non-degenerate prior distribution with well-defined density,
and
c) the naive and tracking sellers stock the same fixed, bounded sequence of quantities 0< jt <M
and experience the same realization of demands and errors.
Then with probability approaching one as t→∞, the tracking seller’s predictive distribution of
demand Ft(·|St,Crt ) will first-order stochastically dominate the naive seller’s predictive distribution
of demand Ft(·|St,Cnt ).
We present the proof in Appendix A. The proof relies on the observation that, if the two sellers
stock the same quantities, they see the same sequence of sales observations. The proof has three
distinct parts. First, we show that the difference Pr{cnt = 1} − Pr{crt = 1} is bounded away from
zero. Second, we show that this implies Pr{∑t
τ=1(1− crτ )<∑t
τ=1(1− cnτ )}→ 1. The stochastic
ordering of the predictive distributions then follows from a result in Braden and Freimer (1991).
Propositions 1 and 2 rely on the assumption that the naive and tracking seller stock the same
sequence of stocking quantities. In practice, however, we would expect the two sellers to stock
quantities based on their evolving beliefs about demand. Although a naive seller underestimates
demand relative to a tracking seller for the same stocking quantity, the naive seller will increasingly
perceive stockouts as her stocking levels decrease. In the following sections, we study the conver-
gence of this process and the magnitude of the naive seller’s bias in steady state. Section 5 pursues
these goals analytically for the special case of exponential demand, and we present a numerical
procedure in Section 7 for discrete distributions.
5. Exponential Demand
The previous section compared the observations made by the naive and tracking sellers given a
common sequence of order quantities. In this section, we assume that the naive seller chooses an
order quantity jt in period t that depends on her own demand estimates. Therefore she engages
in a process of estimate-order-estimate. We seek to characterize fixed points of such a process.
Assuming demand follows an exponential distribution F (d|θ) = 1− exp{−θd} for some θ > 0, we
express a fixed point of the estimate-order-estimate process as the solution to a single equation in
Section 5.1, which we use to gain insights in Section 5.2.
5.1. Characterizing Fixed Points
Here we assume a repeated newsvendor setting in which inventory perishes between periods. We
seek fixed points of the naive seller’s stocking and estimation process. That is, we seek pairs (θ, j)
11
such that (1) if the seller’s demand estimate is fixed at θ > 0 then the seller will stock j > 0, and
(2) if the seller stocks j repeatedly then the seller’s demand parameter estimate converges to θ.
We will numerically demonstrate in Section 5.2 that the fixed points we characterize here describe
steady state behavior of the naive seller’s estimate-order-estimate process.
We assume that demand follows an exponential distribution (which is a newsvendor distribution),
and we assume that the seller has a gamma(a0, b0) prior on θ. This is to simplify our analysis,
although the Bernstein-von Mises theorem (see Le Cam 1986) suggests that the choice of prior is
irrelevant as the number of observations becomes large and thus has no bearing on our results. We
also assume that, given an estimate θ of the demand parameter, the naive seller employs a critical
fractile policy, choosing j to achieve a particular pre-defined service level γ ∈ (0,1) with respect to
the assumed exponential demand distribution F (·|θ), i.e.,
j =−1
θln(1− γ),
which follows directly from setting γ equal to the cdf of the exponential distribution with rate θ.
This is an optimal stocking quantity for a newsvendor with known θ= θ and no inventory errors;
therefore it specifies reasonable limiting behavior of a policy that is learning θ while oblivious to
errors.
Let the random variables i, s and cn represent customer-available inventory, sales, and naive
seller stockout indicator, respectively, if the seller stocks j. Given stocking quantity j, the naive
seller perceives a stockout with probability
E[cn] = Pr{i≥ j}Pr{d≥ j}= Pr{r≤ 0} exp{−θj}= Pr{ε≤ 0} exp{−θj} (6)
and sees average sales
E[s] =
∫ ∞−∞
Ed[min{d, j− r}]dKj(r).
Making use of the memoryless property of the exponential distribution,
Ed[min{d, j− r}] =1
θ(1− exp{−θ(j− r)}),
which yields
E[s] =1
θ− 1
θ
∫ ∞−∞
exp{−θ(j− r)}dKj(r)
=1
θ− 1
θexp{−θj}Mr(θ, j), (7)
where Mr(θ, j)≡ Er[eθr|j] is the moment-generating function of the random variable r evaluated
at θ. We assume this function exists at θ for all j > 0.
12
Suppose the naive seller stocks j each period and that E[j − r]> 0 so that exp{−θ(j − r)}< 1,
then the expected sales E[s] each period is constant and positive and bt/t converges almost surely to
this expectation by the strong law of large numbers. This enables us to show that the naive seller’s
posterior belief converges to the value E[1 − cn]/E[s]. In particular, letting θt denote a random
variable distributed according to the seller’s posterior belief at time t,
E[θt] → limt→∞
(1/t)at(1/t)bt
=limt→∞(1/t)atlimt→∞(1/t)bt
=limt→∞(1/t)
∑t
τ=1(1− cnτ )
limt→∞(1/t)∑t
τ=1 sτ=
E[1− cn]
E[s],
Var[θt] → limt→∞
(1/t2)at(1/t2)b2
t
=E[1− cn]
E[s]limt→∞
1/t
(1/t)bt= 0.
Hence, if the naive seller repeatedly stocks j > 0, her demand parameter estimate converges to
θ= E[1− cn]/E[s].
Substituting the expressions (6) and (7) into θ= E[1− cn]/E[s], we find that a fixed point (θ, j)
must satisfyθ
θ=
1−Pr{ε≤ 0} exp{−θj}1−Mr(θ, j) exp{−θj}
, where j =−1
θln(1− γ). (8)
A solution to equation (8) will have θ ≥ θ if Mr(θ, j)≥ Pr{ε≤ 0} The following proposition gives
an interpretable sufficient condition for this to be true.
Proposition 3. If equation (8) has a finite solution (θ, j) such that E[r|j]≥ 0, then θ≥ θ.
Proof. Let ξ ≡ E[r|j], then
Mr(θ, j) = E[eθr|j
]= eθξE
[eθ(r−ξ)|j
]≥ E
[eθ(r−ξ)|j
]≥ E
[e0]
= 1,
where the first inequality follows from the assumption E[r|j] = ξ ≥ 0 and the fact that eθx is an
increasing function of x when θ > 0, and the second inequality follows from Jensen’s inequality.
Because Pr{ε≤ 0} ≤ 1, we haveMr(θ, j)≥Pr{ε≤ 0} and therefore the right-hand side of equation
(8) is greater than zero. (The finiteness of θ implies that 1−Mr(θ, j) exp{−θj}). Therefore θ≥ θ.
Recall that the parameter (rate) of an exponential distribution is the reciprocal of its mean.
Therefore, θ≥ θ implies that the seller underestimates demand.
Equation (8) is challenging to analyze further (and even to solve numerically) because Mr(θ|j)
depends on the error distribution and θ (through j) in a complex way. This motivates our study
of an important special case of equation (8). In many settings we expect that errors will be small
13
relative to the quantity of inventory stocked. If we restrict ourselves to fixed points where Pr{ε >
j}= 0 then we can replace Mr(θ|j) with Mε(θ)≡ E [eθε], which is a constant with respect to θ and
j. This yieldsθ
θ=
1−Pr{ε≤ 0} exp{−θj}1−Mε(θ) exp{−θj}
, where j =−1
θln(1− γ). (9)
Even when Pr{ε > j} is greater than zero but small (i.e., when γ is relatively large and errors are
small relative to demand), we have found this to be a very good approximation. Furthermore, it is
easy to see that Mr(θ|j)≤Mε(θ) for all j, and a simple modification to the proof of Proposition
3 shows that if E[ε]≥ 0 then any solution θ to equation (9) satisfies θ≥ θ. The appendix provides
a more detailed analysis of equations (8) and (9), including conditions for the existence of their
solutions and a comparison of their solutions.
5.2. Numerical Evaluation
In this section, we compute solutions to the equations derived in the previous subsection. Our pur-
pose is to understand the magnitude and sensitivity of the estimation bias suggested by Proposition
3.
For all results reported in this subsection, we assume θ= 1/20 so that the true demand distribu-
tion is the exponential distribution with mean 20 units. For the instances we consider, equations
(8) and (9) are essentially equivalent for our purposes: the difference between Mε(θ) and Mr(θ|j)
is smaller than 10−8 over all the solutions we report. For simplicity, we therefore solve (9). For
all instances we report, we find two solutions to (9). (Proposition 8 in the appendix provides con-
ditions for equation (9) to have two solutions, and our instances all satisfy these conditions.) In
all cases, one has θ in the vicinity of the true demand mean and one has θ much larger than the
mean demand such that j is very small. In what follows, we report the former solution. We will
demonstrate later that convergence is consistently to this solution.
Table 2 reports numerical solutions to (9) for several critical fractile parameters γ when errors
follow a variety of distributions: (1) normal with mean 0 and variance 1 (Mε(θ) = 1.00125, Pr{ε≤
0}= 0.5), (2) normal with mean 0 and variance 4 (Mε(θ) = 1.00501, Pr{ε≤ 0}= 0.5), (3) a 75/25
mixture of a pulse at zero and normal with mean 0 and variance 4 (Mε(θ) = 1.00125, Pr{ε≤ 0}=
0.875), and (4) a 50/50 mixture of a pulse at zero and exponential with mean 1 (Mε(θ) = 1.02632,
Pr{ε≤ 0}= 0.5). The table provides the following for each instance solved.
• γ: the critical fractile used in the stocking policy. Because the critical fractile policy does not
account for errors and because errors interfere with service, customer perceptions of service will
typically fall below γ.
14
Error Distribution γ Adj. γ Actual SL 1-E[cn] 1/θ0.70 0.680 0.571 0.795 14.840.80 0.780 0.728 0.874 17.100.85 0.831 0.799 0.909 18.00
(1) errors ∼ normal(0,1)0.90 0.881 0.865 0.942 18.780.95 0.931 0.926 0.973 19.440.99 0.971 0.970 0.995 19.900.70 0.661 0.550 0.795 14.780.80 0.761 0.708 0.873 17.070.85 0.812 0.780 0.909 17.98
(2) errors ∼ normal(0,4)0.90 0.862 0.847 0.942 18.760.95 0.912 0.908 0.973 19.440.99 0.952 0.952 0.995 19.900.70 0.690 0.669 0.719 18.870.80 0.790 0.780 0.816 19.350.85 0.840 0.834 0.863 19.54
(3) errors ∼{δ0 w.p. 3/4normal(0,4) w.p. 1/4 0.90 0.890 0.887 0.910 19.71
0.95 0.941 0.940 0.955 19.870.99 0.981 0.981 0.991 19.970.70 0.689 0.580 0.895 14.400.80 0.795 0.743 0.936 16.900.85 0.846 0.816 0.954 17.87
(4) errors ∼{δ0 w.p. 1/2exponential(1) w.p. 1/2 0.90 0.897 0.884 0.971 18.70
0.95 0.949 0.945 0.986 19.410.99 0.990 0.989 0.997 19.89
Table 2 Solutions to equation (9).
• “Adjusted γ”: the (Type 1) service level obtained by the critical fractile policy if θ were known.
The difference between this value and γ represents the impact of errors on service.
• “Actual SL”: the actual (Type 1) service level as seen by customers when j is stocked. This
differs from “Adjusted γ” because the seller’s demand estimate may deviate from the true demand
parameter.
• 1−E [cn]: the service level implied by the stockout indicators observed by the naive seller.
• 1/θ: the naive seller’s estimate of mean demand at the fixed point.
Our results conform to the finding of Proposition 3 in that the naive seller consistently underes-
timates demand. That is, we see 1/θ < 20 for every instance. We see that the magnitude of the bias
depends on the critical fractile parameter γ: the estimation error is small when γ is high and more
significant for small γ. This may be reassuring to retailers who target high service levels, although
due to several observable factors (e.g., upstream supply chain performance), in practice a retailer’s
stocking actions may fall short of its targets even before inventory errors are taken into account
(e.g., Gruen et al. 2002).
15
The magnitude of the bias also depends on the error distribution. The error variances for dis-
tributions (1) and (3) are the same, but we see that they induce very different biases. On the
other hand, (4) has the smallest variance but induces the largest biases. Inspection of equation (9)
reveals that Pr{ε≤ 0} is an important statistic for determining bias, and these results support that
observation. The error distributions (1), (2), and (4) share the same Pr{ε≤ 0}= 0.5 and induce
similar biases to each other, while (3) has Pr{ε≤ 0}= 0.875 and induces smaller biases. We make
use of the statistic Pr{ε≤ 0} in a heuristic prescription described in Section 6.3.
We observe in all of these instances not only that the service level seen by customers (“Actual
SL”) is lower than the target service level (“Adjusted γ”), but also that the naive seller’s estimate
of service level 1−E[cn] exceeds the target service level. These discrepancies are again largest for
relatively low γ and relatively low Pr{ε ≤ 0}. We conclude that the impact of inventory record
inaccuracy on demand estimation can hide, in that the naive seller may believe that it is exceeding
its service level targets even as the customers are observing service levels below the targets.
We conclude this section by addressing the question of convergence to the fixed points identified
by equation (9). Figure 1 summarizes results of simulations of the naive seller’s estimate-order-
estimate process, assuming the seller employs a Bayesian myopic policy in each period t that targets
a (Type 1) service level of γ = 0.85 in her predictive distribution f(·|at, bt). In these simulations,
the true θ= 1/20 and errors are distributed as ε∼ normal(0,4). We start the simulation from three
different gamma priors: one with mean 1/20, one which that overestimates 1/20, and one that
mainly underestimates 1/20. We simulate a repeated newsvendor for 1000 sample paths starting
from each prior. The histogram in Figure 1 represents the reciprocal of the posterior mean at the
end of 3000 periods for each sample path. We observe that the posterior means are clustered around
the fixed point reported in Table 2 (1/θ= 17.98 for this instance), indicated by the vertical line in
the figure. We have seen similar behavior for other instances and other gamma priors, leading us
to conclude that the fixed points reported in Table 2 are representative of steady-state behavior.
6. Prescriptions
Up until now, our focus has been on describing the estimation process of a seller operating under
a misspecification of the sales-generating process. Here, we briefly consider three prescriptions for
the problem, each requiring different sets of information about errors.
6.1. Full Error Information
Clearly, a tracking technology such as RFID has the potential to provide a seller full visibility
into customer-available inventory positions and hence to foster demand estimation on par with
16
15 16 17 18 19 200
50
100
150
200
250
300
350
400
reciprocal of posterior mean after 3000 periods
freq
uenc
y
a0=7.5, b0=37.5a0=7.5, b0=150a0=7.5, b0=300
Figure 1 Posterior means after 3000 time periods for θ = 1/20, γ = 0.85, ε ∼ normal(0,4), and three different
priors on demand.
our tracking seller. In this way our work identifies an important component of the value of such
tracking technologies. As highlighted in Sections 1 and 2, previous research has revealed the impact
of inventory record inaccuracy on the matching of supply and demand, although these studies have
all assumed that demand distributions are known. Our work shows that a tracking technology
has the potential not only to reduce supply-demand mismatch costs but also to yield improved
inventory management through better demand estimation.
6.2. Distributional Error Information
If the seller does not have full information on customer-available inventory levels but knows the
probability distribution H of inventory errors, a conceptually simple but computationally complex
approach is for the seller to incorporate the error uncertainty into its updates of its beliefs around
the demand parameter θ. For simplicity, we develop this for the case in which both F (·|θ) and
H are continuous distributions. In contrast to the case assuming no errors (equation (2)), here it
becomes necessary to condition on the history Jt ≡ (j1, . . . , jt) of stocking quantities rather than
the sequence of indicators Cnt = (cn1 , . . . , c
nt ). (This implies no loss of information, because one can
calculate cnτ = 1l{sτ ≥ jτ} from sτ and jτ .) Letting πHt (θ|St, Jt) indicate the posterior density of θ
at time t when H is known, we have
πHt (θ|St, Jt)∝ π0(θ)t∏
τ=1
q(sτ |θ, jτ ),
where q(sτ |θ, jτ ) is the density representing sales as a function of θ and jτ . The complementary
cumulative demand distribution associated with this density can be written as
Q(sτ |θ, jτ ) =Kjτ (jτ − sτ )F (sτ |θ)
17
with F (s|θ) ≡ 1− F (s|θ) and Kj(r) ≡ 1−Kj(r). The updating simplicity afforded by assuming
a newsvendor demand distribution does not carry over to this update, so an involved numerical
computation is required to compute πH(·|St, Jt). More importantly, this approach requires the
seller to have access to the full distribution H of errors even as she does not know the demand
distribution. For these reasons, we do not pursue this approach further.
6.3. Partial Distributional Error Information
It may be unrealistic to expect a seller who does not know the demand distribution to know the
full probability distribution of errors. Here we suggest a heuristic approach that makes use of a
specific statistic of the error distribution that is easier to estimate than the full error distribution.
In the case of newsvendor demand distributions, the demand “update” using this approach can
be expressed in closed form similar to the case without errors. Assuming exponential demands, we
will show that a “heuristic seller” (that is, a seller employing this approach) will underestimate
demand in the long run, but that heuristic seller’s underestimation of demand will be smaller than
that of the naive seller.
The heuristic is based on the observation from Table 1 that
E[cnt ] = Pr{it >dt ≥ jt}+ Pr{dt ≥ it ≥ jt}= Pr{dt ≥ jt}Pr{εt ≤ 0}.
Therefore E[cnt ]/Pr{εt ≤ 0} is an unbiased estimator for Pr{dt ≥ jt}, which is the expectation of the
tracking seller’s stockout indicator when it = jt. The heuristic is simply to base stocking decisions
on a belief on θ that takes a gamma distribution with parameters (aht , bt), where
aht = max
{1, a0 +
t∑τ=1
(1− cnτ
Pr{ε≤ 0}
)}, (10)
and bt is as defined in equation (4). The “max” in the expression for aht is simply to preclude
aht < 0 for small t, and has no impact on the long-run performance of the heuristic. Beyond this,
a comparison of equations (10) and (3) shows that we are simply dividing the stockout indicators
by Pr{ε≤ 0}.
Assuming exponentially distributed demand and following a similar analysis to the one in Section
5.1, we can characterize fixed points of the estimate-order-estimate process involving the suggested
heuristic. Analogous to equation (8), we find that such fixed points solve
θ
θ=
1− exp{−θj}1−Mr(θ, j) exp{−θj}
, where j =−1
θln(1− γ). (11)
The following proposition establishes that at fixed points, the heuristic seller underestimates
demand, but the bias is smaller in magnitude than that of the naive seller.
18
0.7 0.75 0.8 0.85 0.9 0.95 111
12
13
14
15
16
17
18
19
20
21
γ
1/θn
1/θh
0.7 0.75 0.8 0.85 0.9 0.95 111
12
13
14
15
16
17
18
19
20
21
γ
1/θn
1/θh
Figure 2 Solutions to equations (9) and (11) for θ = 1/20 and ε ∼ normal(0,4) (left) and ε ∼ 0.5δ0 + 0.5 ·exponential(1) (right).
Proposition 4. Suppose that both equations (9) and (11) have finite solutions, and let (θn, jn) be
the solution to (9) with smallest θ and (θh, jh) be the solution to (11) with smallest θ. If E[r|jn]≥ 0
then θn ≥ θh ≥ θ.
Proof. Let Rn(θ) and Rh(θ) indicate the right-hand sides of equations (8) and (11), respectively.
We first show that θn ≥ θh. Because Pr{ε≤ 0} ≤ 1,
Rn(θ) ≡1−Pr{ε≤ 0} exp
{θ
θln(1− γ)
}1−Mr(θ,−(1/θ) ln(1− γ)) exp
{θ
θln(1− γ)
}≥
1− exp{θ
θln(1− γ)
}1−Mr(θ,−(1/θ) ln(1− γ)) exp
{θ
θln(1− γ)
} ≡Rh(θ)
for all θ with Mr(θ,−(1/θ) ln(1−γ)) exp{θ
θln(1− γ)
}< 1. Because limθ→0+ R
h(θ) = 1> 0, we know
that Rh(θ)> θθ
for θ < θh. Therefore, θn
θ=Rn(θn)≥Rh(θn) implies θn ≥ θh.
Second, we show θh ≥ θ. Because θn ≥ θh, we have jn ≤ jh, which implies Mr(θ, jh)≥Mr(θ, j
n)≥
1, where the second inequality is shown in the proof of Proposition 3. It then directly follows from
equation (11) that θh ≥ θ.
We can explore the impact of the heuristic by numerically solving equation (11).2 Figure 2
compares the solutions to equation (11) with those from equation (9) by plotting 1/θn and 1/θh for
normal(0,4) errors and for a mixture of zero and exponential(1) errors. We observe that the heuristic
sharply reduces the bias for all critical fractiles γ and for both error distributions, although the
heuristic is relatively less effective for the mixture of exponential and zero. This is expected given
2 Much as we did in Section 5.2, we solve a version of (11) with Mr(θ, j) replaced by Mε(θ). The difference is negligiblefor the instances we consider.
19
Inputs: Data (St,Ct), “starting” distribution p0(θ), “jumping” distribution J(θ|θi−1).
Steps:Draw θ0 from p0.For i= 1,2, . . .
Sample θ∗ from J(·|θi−1).Calculate density ratio
ρi =πt(θ
∗|St,Ct)πt(θi−1|St,Ct)
.
Set
θi =
{θ∗ w.p. min{ρi,1}θi−1 otherwise.
Endfor
Outputs: Samples θ0, θ1, θ2, . . . .
Figure 3 Metropolis algorithm specialized to generate samples from πt(θ|St,Ct).
that the right-hand side of equation (11) deviates from one only inasmuch as Mr(θ, j) 6= 1, and
given that Mr(θ, j) = 1.02632 for this distribution versus 1.00501 for the normal error distribution.
7. Discrete Demand Distributions
A limitation of the analyses in Sections 5 and 6.3 are their assumptions of exponential demand. It is
of interest to explore the extent to which our findings extend to other demand forms. For example,
retail demand is often modeled as discrete, following a Poisson or negative binomial probability
distribution. As Braden and Freimer (1991) suggest, there does not seem to be a meaningful data
reduction for these demand distributions as for the newsvendor distributions described in Section
3.2. This makes computation of the Bayes update of equation (1) considerably more complicated,
and impossible to meaningfully express in closed form. In this section we present a framework for
numerically evaluating fixed points of the estimate-order-estimate process. We demonstrate the
framework on a set of instances assuming Poisson demand.
7.1. Procedure
We approximate the posterior demand distribution given data (St,Ct) with samples generated
using the Metropolis algorithm (see, for example, Gelman et al. 2004, Chapter 11). The Metropolis
algorithm iteratively generates a sequence of samples which, after a suitable warmup period, can
be viewed as samples from the desired posterior distribution. Figure 3 lays out the steps of the
algorithm.
Several implementation decisions are required. We calculate the mean µS and standard deviation
σS of the vector St. For the starting distribution, we use a normal distribution with mean µS and
20
standard deviation σS. For the jumping distribution, we use a normal distribution with mean θi−1
and standard deviation 0.05σS. This yields average “acceptance rate” 1− ρ between 0.30 and 0.50
for most of the instances we tried. (There is some theoretical basis for targeting an acceptance rate
around 0.44. See Gelman et al. 2004.) We approximated πt(θ|St,Ct) by 5000 samples generated
after discarding the first 1000 samples.
Given this Metropolis implementation, we use the following procedure to approximate fixed
points of the estimate-order-estimate process. For each potential integer order quantity j (from a
range of potential order quantities), we use the true demand mean θ to generate a sample path of
length T = 5000 periods, yielding values for (ST ,CnT ). Using (St,C
nT ) as inputs to the Metropolis
algorithm, we generate N = 5000 samples {θi}i=1,...,N with density πT (·|ST ,CnT ) and use these
samples to estimate the posterior mean θ(j) = 1N
∑N
i=1 θi. If j equals the critical fractile quantity
given mean demand θ(j), then we declare that j is (approximately) a fixed point of the estimate-
order-estimate process. In cases in which we find multiple fixed points for the same set of model
primitives, we report the largest of the fixed points in what follows. For large error variances and
very small target critical fractiles, we were not able to find fixed points. Simulations suggest that
the process may not converge to nonzero demand estimates for these instances.
7.2. Adapting the Heuristic Prescription
There is a straightforward adaptation of the heuristic prescription introduced in Section 6.3 to the
general case. When F (d|θ) is discrete, we modify the update of equation (1) to
πht (θ|St,Ct)∝ π0(θ)t∏
τ=1
(1−F (sτ − 1|θ))cτ/Pr{ε≤0}f(sτ |θ)1−cτ/Pr{ε≤0}. (12)
The exponent 1− cτ/Pr{ε≤ 0} may be negative and therefore hard to interpret, but this does not
cause computational issues. The density πht (θ|St,Ct) remains a valid density, and the Metropolis
algorithm can be applied to any valid density. Therefore, the procedure proposed in Section 7.1
can also be used to estimate fixed points of the heuristic seller’s process.
7.3. Illustration for Poisson Demand
We present two sets of results generated using this procedure for repeated newsvendors. In both
examples, the true demand distribution is Poisson with mean 20. The errors each period are given by
independent symmetric Skellam distributions (i.e., the distribution of the difference of two Poisson
random variables with the same means) in Figure 4(a), and by independent Poisson distributions
in Figure 4(b). The figures show both the naive and heuristic sellers’ fixed point demand means
for various error variances and target critical fractiles γ.
21
0.6 0.65 0.7 0.75 0.8 0.85 0.9 0.95 115
16
17
18
19
20
21
γ
θ
naive : λ+ = 1, λ− = 1
naive : λ+ = 4, λ− = 4
heur. : λ+ = 1, λ− = 1
heur. : λ+ = 4, λ− = 4
0.6 0.65 0.7 0.75 0.8 0.85 0.9 0.95 115
16
17
18
19
20
21
γ
θ
naive : λ+ = 0.5, λ− = 0
naive : λ+ = 2, λ− = 0
heur. : λ+ = 0.5, λ− = 0
heur. : λ+ = 2, λ− = 0
Figure 4 Steady state demand estimates of the naive and heuristic sellers (a) when errors follow Skellam(λ+, λ−)
distributions and (b) when errors follow Poisson(λ+) distributions.
We see that both naive and heuristic sellers underestimate the true demand mean θ= 20 just as
in Section 5, and the biases are largest when demand errors are most substantial and when target
service levels are smallest. We see that the bias depends on the shape of the error distribution
particularly through the statistic Pr{ε ≤ 0}. For example, the bias is much more significant for
Poisson(2) errors (variance 2, Pr{ε≤ 0}= 0.135) than for Skellam(1,1) errors (variance 2, Pr{ε≤
0}= 0.654). Finally, we see that the heuristic eliminates a substantial portion, but not all, of the
bias. In short, we find that the results under Poisson demand are aligned with those we saw under
exponential demand.
8. Concluding Remarks
We conclude that ignoring inventory record inaccuracy when trying to learn demand from cen-
sored observations can have a deleterious impact on a firm’s operations, leading to systematic
underestimation of demand that in turn impact the true service levels experienced by the firm’s
customers. Even as this happens, the service levels may appear to the firm to exceed targets. We
find that the firm’s demand underestimation is largest when service levels are relatively low and
when there is a significant probability of positive (i.e., stock-reducing) errors. We suggest a simple
heuristic correction that makes use of the probability of positive errors (equivalently, the probabil-
ity of non-positive errors). Our work suggests a new component of the value of inventory tracking
technologies. In addition, we complement existing literature on inventory record inaccuracy by
revealing another incentive for firms to systematically understand their inventory errors.
Our work leaves several theoretical questions unanswered. While the fixed points we have iden-
tified in our analyses seem to represent steady-state behavior in our numerical studies, we have
22
side-stepped a theoretical study of convergence. In addition, we leave open the study of optimal
stocking policies in the presence of demand learning and inventory errors. In this vein, a reasonable
question is whether the “stock more” results of Lariviere and Porteus (1999) and Ding et al. (2002)
carry over to the setting with inventory errors.
Finally, some work would be required to apply our ideas to practical settings. While the Bayesian
model we have followed is rigorous and convenient for analysis, it would be interesting to investigate
to what extent our results extend to non-parametric settings or to estimation paradigms used in
practice. We are aware of several retailers who account for censoring via extrapolation schemes.
For example, if in a seven-day week the firm believes that the customer-available inventory was
zero for two of the days, then the firm multiplies observed sales by 7/5 to approximate the week’s
demand. We leave it for future work to understand the impact of inventory inaccuracy and to
invent effective prescriptions in such settings.
Acknowledgments
The author thanks Vinayak Deshpande and Linus Schrage for comments on a draft version of this manuscript.
The author thanks the University of North Carolina Kenan-Flagler Business School for financial support.
References
Agrawal, N., S. Smith. 1996. Estimating negative binomial demand for retail inventory management with
unobservable lost sales. Nav. Res. Log. 43 839–861.
Atali, A., H. Lee, O. Ozer. 2011. If the inventory manager knew: Value of visibility and RFID under imperfect
inventory information. Stanford University. Working paper.
Bensoussan, A., M. Cakanyildirim, A. Royal, S. Sethi. 2009a. Bayesian and adaptive controls for a newsvendor
facing exponential demand. Risk and Decision Analysis 1 197–210.
Bensoussan, A., M. Cakanyildirim, S. Sethi. 2007. Partially observed inventory systems: The case of zero
balance walk. SIAM J. Contr. Opt. 46(1) 176–209.
Bensoussan, A., M. Cakanyildirim, S. Sethi. 2009b. A note on ‘the censored newsvendor and the optimal
acquisition of information’. Oper. Res. 57(3) 791–794.
Besbes, O., A. Muharremoglu. 2012. On implications of demand censoring in the newsvendor problem.
Columbia University. Working paper.
Bisi, A., M. Dada, S. Tokdar. 2011. A censored-data multiperiod inventory problem with newsvendor demand
distributions. Manufacturing Service Oper. Management 37(11) 1390–1405.
Braden, D. J., M. Freimer. 1991. Informational dynamics of censored observations. Management Sci. 37(11)
1390–1405.
23
Burnetas, A., C. Smith. 2000. Adaptive ordering and pricing for perishable products. Oper. Res. 48(3)
436–443.
Cachon, G. P., A. G. Kok. 2007. How to (and how not to) estimate the salvage value in the newsvendor
model. Manufacturing Service Oper. Management. 9(3) 276–290.
Camdereli, A. Z., J. M. Swaminathan. 2010. Misplaced inventory and radio-frequency identification (RFID)
technology: Information and coordination. Production Oper. Management 19(1) 1–18.
Chen, L. 2010. Bounds and heuristics for optimal Bayesian inventory control with unobserved lost sales.
Oper. Res. 58(2) 396–413.
Chen, L., E. L. Plambeck. 2008. Dynamic inventory management with learning about the demand distribu-
tion and substitution probability. Manufacturing Service Oper. Management. 10(2) 236–256.
Conrad, S. A. 1976. Sales data and the estimation of demand. Operational Res. Quart. 27(1) 123–127.
Cooper, W. L., T. Homem-de-Mello, A. J. Kleywegt. 2006. Models of the spiral down effect in revenue
management. Oper. Res. 54(5) 968–987.
Cooper, W. L., T. Homem-de-Mello, A. J. Kleywegt. 2009. Learning and pricing with models that do not
explicitly incorporate competition. Working paper, University of Minnesota.
DeHoratius, N., A. Mersereau, L. Schrage. 2008. Retail inventory management when records are inaccurate.
Manufacturing Service Oper. Management. 10(2) 257–277.
DeHoratius, N., A. Raman. 2008. Inventory record inaccuracy: An empirical analysis. Management Sci.
54(4) 627–641.
DeHoratius, N., Z. Ton. 2009. The role of execution in managing product availability. N. Agrawal, S. A.
Smith, eds., Retail Supply Chain Management . Springer Science+Business Media, LLC, 53–77.
Ding, X., M. Puterman, A. Bisi. 2002. The censored newsvendor and the optimal acquisition of information.
Oper. Res. 50(3) 517–527.
Gaukler, G. M., R. W. Seifert, W. H. Hausman. 2007. Item-level RFID in the retail supply chain. Production
Oper. Management 16(1) 65–76.
Gelman, A., J. B. Carlin, H. S. Stern, D. B. Rubin. 2004. Bayesian Data Analysis. 2nd ed. Chapman &
Hall/CRC.
Gruen, T. W., D. Corsten, S. Bharadwaj. 2002. Retail out of stocks: A worldwide examination of causes,
rates, and consumer responses. Washington, D.C.: Grocery Manufacturers of America.
Harpaz, G., W. Lee, R. Winkler. 1982. Learning, experimentation, and the optimal output decisions of a
competitive firm. Management Sci. 28 589–603.
Heese, H. S. 2007. Inventory record inaccuracy, double marginalization, and RFID adoption. Production
Oper. Management 16(5) 542–553.
24
Huh, W. T., R. Levi, P. Rusmevichientong, J. B. Orlin. 2011. Adaptive data-driven inventory control with
censored demand based on Kaplan-Meier estimator. Oper. Res. 59(4) 929–941.
Huh, W. T., P. Rusmevichientong. 2009. A non-parametric asymptotic analysis of inventory planning with
censored demand. Math. Oper. Res. 34(1) 103–123.
Iglehart, D. L., R. C. Morey. 1972. Inventory systems with imperfect asset information. Management Sci.
18(8) B338–B394.
Kang, Y., S. Gershwin. 2005. Information inaccuracy in inventory systems: Stock loss and stockout. IIE
Trans. 37 843–859.
Kok, A. G., K. Shang. 2007. Inspection and replenishment policies for systems with inventory record inac-
curacy. Manufacturing Service Oper. Management 9(2) 185–205.
Kok, A. G., K. Shang. 2010. Evaluation of supply chains with inventory record inaccuracy and implications
on RFID investments. Duke University. Working paper.
Lariviere, M., E. Porteus. 1999. Stalking information: Bayesian inventory management with unobserved lost
sales. Management Sci. 45(3) 346–363.
Le Cam, L. 1986. Asymptotic Methods in Statistical Decision Theory . Springer-Verlag.
Lee, H., O. Ozer. 2007. Unlocking the value of RFID. Prod. Oper. Management 16(1) 40–64.
Lee, S., T. Homem de Mello, A. Kleywegt. 2009. Newsvendor models with decision-dependent uncertainty.
Northwestern University. Working paper.
Lu, X., J.-S. Song, K. Zhu. 2005. On ‘the censored newsvendor and the optimal acquisition of information’.
Oper. Res. 53(6) 1024–1027.
Lu, X., J.-S. Song, K. Zhu. 2007. Inventory control with unobservable lost sales and Bayesian updates. Duke
University. Working paper.
Lu, X., J.-S. Song, K. Zhu. 2008. Analysis of perishable-inventory systems with censored demand data. Oper.
Res. 56(4) 1034–1038.
Mersereau, A. J. 2011. Information-sensitive replenishment when inventory records are inaccurate. Forth-
coming in Production Oper. Management.
Nahmias, S. 1994. Demand estimation in lost sales inventory systems. Naval Res. Log. 41 739–757.
Rekik, Y., E. Sahin, Y. Dallery. 2008. Analysis of the impact of RFID technology on reducing product
misplacement errors at retail stores. Int. J. Prod. Econom. 112(1) 264–278.
Ton, Z., A. Raman. 2006. Cross sectional analysis of phantom products at retail stores. Harvard University.
Working paper.
Wecker, W. E. 1978. Predicting demand from sales data in the presence of stockouts. Management Sci.
24(10) 1043–1054.
25
A. Proof of Proposition 2
The result is easy to see once we establish two lemmas.
Lemma 5. Under the assumptions of Proposition 2, Pr{cnτ = 1}−Pr{crτ = 1}= E [cnτ − crτ ]≤ η where
η < 0 is a constant with respect to jτ .
Proof. We consider two cases. First, assume Pr{ε < 0}= 0 so that Pr{jτ ≥ iτ}= 1 and case (C) of
Table 1 occurs with probability zero. Under this assumption, Table 1 implies that we can write
Pr{cnτ = 1}−Pr{crτ = 1} = −Pr{jτ >dτ ≥ iτ}−Pr{dτ ≥ jτ > iτ}
= −Pr{dτ ≥ jτ > iτ}
≤ −Pr{dτ ≥ jτ}Pr{ετ > 0}
≤ −Pr{dτ ≥M}Pr{ετ > 0}
= −Pr{ετ > 0}e−θg(M),
where the first inequality relies on the independence of dτ and iτ given jτ .
Second, assume Pr{ε < 0}> 0. The first inequality below is shown in the proof of Proposition 1.
Pr{cnτ = 1}−Pr{crτ = 1} ≤ Pr{dτ ≥ jτ ≥ iτ}−Pr{dτ ≥ iτ}
= Pr{dτ > jτ ≥ iτ}−Pr{dτ ≥ iτ > jτ}−Pr{jτ >dτ ≥ iτ}−Pr{dτ ≥ jτ ≥ iτ}
≤ −Pr{dτ ≥ iτ > jτ}
= −Pr{dτ ≥ iτ |iτ > jτ}Pr{ετ < 0}
≤ −Pr{dτ ≥M − ετ |iτ > jτ}Pr{ετ < 0}
≤ −(1/2)Pr{dτ ≥M −m|iτ > jτ}Pr{ετ < 0}
= −(1/2)Pr{ετ < 0}e−θg(M−m),
where m is the median of the distribution of ε|ε < 0.
Lemma 6. Under the assumptions of Proposition 2, limt→∞Pr{∑t
τ=1(cnτ − crτ )< 0}
= 1
Proof. Consider the random variables
x(τ) = cnτ − crτ
X(t) =t∑
τ=1
(cnτ − crτ ) =t∑
τ=1
x(τ)
Note that x(τ) is defined on {−1,0,1} and therefore its variance v(τ) is upper bounded by 4.
Because the xτ ’s are independent of each other conditional on j1, . . . , jt, therefore
V (t)≡Var (X(t)) =t∑
τ=1
v(τ)≤ 4t.
26
By Lemma 5 we have that
X(t)≡Mean(X(t))≤ ηt < 0.
The one-tailed Chebyshev inequality applied to X(t), yields
Pr{X(t)− X(t)≥ k
√V (t)
}≤ 1
1 + k2for any k > 0.
Specifically, set k=−X(t)/√V (t), then
Pr{X(t)≥ 0}= Pr{X(t)−X(t)≥−X(t)} ≤(
1 +(X(t))2
V (t)
)−1
≤(
1 +η2t2
4t
)−1
=
(1 +
η2t
4
)−1t→∞−→ 0.
To complete the proof of Proposition 2, it remains to show that∑t
τ=1(cnτ − crτ ) =∑t
τ=1(1− crτ )−∑t
τ (1− cnτ ) < 0 implies that the tracking seller’s predictive distribution stochastically dominates
that of the naive seller. Because the naive and tracking sellers stock the same quantities and are
subject to the same sequences of demands and errors, we have srτ = snτ for all τ and therefore∑t
τ=1 srt =
∑t
τ=1 snt . The desired result then follows directly from Proposition 6.3 of Braden and
Freimer (1991).
B. Analysis of Equations (8) and (9)
We begin with some definitions. We define for conciseness of notation:
P = Pr{ε≤ 0}
M = Mε(θ)
q(θ) = exp
{θ
θln(1− γ)
}q′(θ) = exp
{θ
θln(1− γ)
}(− θ
θ2ln(1− γ)
)= q(θ)
(− θ
θ2ln(1− γ)
).
Throughout this section we assume Mε(θ)> 1, for which it is sufficient to have Pr{ε= 0}< 1 and
E[ε]≥ 0. With this notation, we can write the right-hand side of equation (9) as
R(θ) =1−Pr{ε≤ 0} exp
{θ
θln(1− γ)
}1−Mε(θ) exp
{θ
θln(1− γ)
} =1−Pq(θ)1−Mq(θ)
.
We study R(θ) over the range (0, θ), where θ = θ ln(1−γ)
ln(1/M). This range corresponds with 0 < 1−
Mq(θ)< 1. We first present a lemma establishing the shape of R(θ).
Lemma 7. Suppose that E[ε] ≥ 0. Then R(θ) is convex and non-decreasing on (0, θ) with
limθ→θR(θ) = +∞ and limθ→0+ R(θ) = 1.
27
Proof. An argument very similar to one in the proof of Proposition 3 gives us that Mε(θ) ≥ 1.
Therefore R(θ)≥ 1 for all θ ∈ (0, θ). The first limit follows because 1−Mq(θ) = 0, and the second
limit follows because limθ→0+ q(θ) = 0.
We next show that R(θ) is nondecreasing on (0, θ). R(θ) is continuous with derivative
dR(θ)
dθ=
(1−Mq(θ)(−Pq′(θ))− (1−Pq(θ))(−Mq′(θ)
[1−Mq(θ)]2
=q′(θ)
[PMq(θ)−P +M −PMq(θ)
][1−Mq(θ)]2
=q′(θ)[M −P ]
[1−Mq(θ)]2≥ 0
because M ≥ 1≥ P and because q′(θ)≥ 0.
Finally, we show convexity. Letting A=−θ ln(1− γ)[M −P ] and noting that A≥ 0, we have
d2R(θ)
dθ= A
θ2[1−Mq(θ)]2q′(θ)− q(θ)[2θ[1−Mq(θ)]2 + 2θ2[1−Mq(θ)](−Mq′(θ))
]θ4[1−Mq(θ)]4
= Aθ[1−Mq(θ)]q′(θ)− 2q(θ)[1−Mq(θ)] + 2Mθq′(θ)q(θ)
θ3[1−Mq(θ)]3
=Aq(θ)
θ3[1−Mq(θ)]3
{[1−Mq(θ)]
(−θθ
ln(1− γ)− 2
)+ 2θMq′(θ)
}=
Aq(θ)
θ3[1−Mq(θ)]3
{−2 +M
[− θ
Mθln(1− γ) + q(θ)
(−θθ
ln(1− γ) + 2
)]}Showing convexity is equivalent to showing that the term in curly brackets is greater than or equal
to zero. Define z(y) = θyM
+ e−θy[θy+ 2] so that z(− 1
θln(1− γ)
)equals the term in square brackets.
We show that z(y)≥ 2/M for all y ≥ 0, which implies that the term in curly brackets is greater
than or equal to zero. Note that limy→0+ z(y) = 2≥ 2/M and limy→∞ z(y) = +∞. Because z(y) is
differentiable on (0,+∞), we can check miny≥0 z(y)≥ 2/M by verifying z(y∗)≥ 2/M at all positive
stationary points y∗ of z. All positive stationary points must satisfy
z′(y∗) =θ
M− θe−θy
∗[θy∗+ 2] + θe−θy
∗= 0,
or equivalently,
e−θy∗
=1
M
1
1 + θy∗.
Therefore z(y∗) = θy∗
M+ 1
M1
θy∗+1[θy∗+ 2] = 1
M
[1 + θy∗+ 1
1+θy∗
]≥ 2/M because the sum of a positive
number and its reciprocal is greater than or equal to 2.
Lemma 7 implies that a solution to equation (9) is the intersection of a convex function R(θ)
and a linear function L(θ)≡ θ/θ. Therefore equation (9) has at most two solutions on (0, θ). The
following proposition gives a sufficient condition for equation (9) to have two solutions.
28
Proposition 8. Assuming E[ε]≥ 0, a solution θ ∈ (0, θ) to equation (9) exists if and only if
minα∈(0,1)
− ln(
αMε(θ)
)(1− Pr{ε≤0}
Mε(θ)α)
1−α<− ln(1− γ).
Proof. By Lemma 7, R(θ) is greater than L(θ) as θ→ 0+ and as θ→ θ, therefore equation (8)
having two solutions is equivalent to there existing a θ ∈ (0, θ) such that R(θ)<L(θ), i.e.,
θ
θ>
1−P exp{(θ/θ) ln(1− γ)}1−M exp{(θ/θ) ln(1− γ)}
.
Noting that 0<M exp{(θ/θ) ln(1−γ)}< 1 for θ ∈ (0, θ), we can define α≡M exp{(θ/θ) ln(1−γ)}
and reframe the problem as seeking an 0<α< 1 such that
ln(1− γ)
ln(α/M)>
1−Pα/M1−α
.
A simple algebraic manipulation shows that this is equivalent to the desired condition.
Proposition 9. The existence of solutions to equation (9) is sufficient for the existence of a
solution to equation (8).
Proof. We have Mr(θ, j)→Mε(θ) as θ→ 0+ (equivalently, as j→+∞). Therefore, the right-hand
side of equation (8) has the limit 1 as θ→ 0+. The result follows from the continuity of the right-
hand side of (8) and the fact that the right-hand side of (8) is less than or equal to R(θ) for all
θ ∈ (0, θ) because Mr(θ|j)≤Mε(θ).
In all of our numerical studies of equation (9), we have found one solution to be in the vicinity
of the demand mean and one to be close to θ. We have observed convergence to consistently be to
the former solution. For the latter solution, the corresponding j typically yields Pr{j− ε < 0} that
is considerable, suggesting that equation (9) is probably a poor approximation of equation (8) in
the vicinity of this solution. The following result establishes that, for the solutions of interest, the
θ solution of equation (9) upper bounds the θ solution of equation (8). As discussed in Section 5.2,
the difference between these two solutions is negligible for the instances we consider.
Proposition 10. Assume E[ε]≥ 0 and let θ8 and θ9 indicate the smallest θ solutions of equations
(8) and (9) respectively, assuming they both exist. Then θ9 ≥ θ8.
Proof. Let R(θ) indicate the right-hand side of equation (9) as previously, and let R8(θ) indicate
the right-hand side of equation (8) as a function of θ.
The desired result follows because both R(θ) and R8(θ) intersect θ/θ from above at their left-most
intersections, and because Mr(θ, j)≤Mε(θ) implies that R8(θ)≤R(θ) for θ ∈ (0, θ).