
Demand Estimation from Censored Observations with Inventory Record Inaccuracy

Adam J. Mersereau

Kenan-Flagler Business School, University of North Carolina

[email protected]

A retailer cannot sell more than it has in stock; therefore its sales observations are a censored representation

of the underlying demand process. When a retailer or other firm forecasts demand based on past sales

observations, the firm requires an estimation approach that accounts for this censoring. Several authors have

analyzed demand learning in censored environments, but they assume that inventory levels are known and

hence that stockouts are observed. However, for a variety of reasons, firms often do not know how many units

are available to meet demand. In this paper, we investigate the impact of this unknown on demand estimation

and subsequent stocking levels in a censored environment. When the firm does not account for inventory

uncertainty when estimating demand, we discover a systematic downward bias in demand estimates. We

propose and test a heuristic prescription that relies on a single error statistic and sharply reduces this bias.

Our work identifies and measures a new component of the value of tracking technologies such as RFID.

This version: April 17, 2012.

1. Introduction

There has been a surge of interest in recent years in the problem of “inventory record inaccuracy.”

For a variety of reasons, firms’ electronic inventory records do not necessarily match the quantity of

customer-available stock. This uncertainty brings several costs (DeHoratius et al. 2008). Firms must

deploy resources to count stock and reconcile inventory records. Uncertain inventory records hinder

the firm’s ability to match supply with demand, leading to higher inventory and stockout costs.

The current paper exposes another source of cost due to record inaccuracy, namely its confounding

of the firm’s demand estimation. In this way, our findings are relevant for the evaluation of tracking

technologies such as Radio Frequency Identification (RFID).

While the majority of stochastic inventory models in the academic literature assume that the

probability distribution governing demand is fixed and known, the reality is that firms’ forecasts

of demand are perpetually evolving as data is accumulated. In many settings (e.g., retail) the firm

observes sales data that are censored representations of demand. That is, sales in a period are the

minimum of demand in the period and the amount of stock available. Sales are always less than or

equal to demand, and Conrad (1976) demonstrates that a firm that assumes that sales and demand


are equivalent will stock its shelves suboptimally. To correctly learn demand from sales data, the

firm requires an estimation approach that properly accounts for the censoring. But it is difficult for

a firm to properly account for censoring when erroneous inventory records prevent the firm from

accurately assessing when censoring occurs.

In this paper, we investigate the impact of inventory record inaccuracy on demand estimation and

subsequent stocking levels in a censored environment. The firm chooses a stocking level each period

in the face of an unknown demand distribution. We assume that the firm is sophisticated with

respect to demand estimation, and we borrow the parametric Bayesian demand learning models of

Lariviere and Porteus (1999) and Ding et al. (2002) to account for censoring. However, most of our

analysis assumes that the firm does not account for record inaccuracy when estimating demand.

This appears to be common practice. While we are aware of several retailers who use methods to

correct for censoring, it seems uncommon for retailers to take record inaccuracy into account when

estimating demand. We examine the estimation biases resulting from such a mischaracterization,

and we characterize fixed points of this misspecified estimation process. We discover a systematic

downward bias in demand estimates, even while service levels may appear to the firm to meet

or exceed targets. An implication is that ignoring record inaccuracy can negatively impact the

profitability of a firm even as the firm may fail to detect the problem.

An effective RFID implementation can potentially mitigate the problem by providing the seller

better visibility into censoring, but at substantial expense. In the absence of an RFID

implementation, we propose a heuristic prescription to the problem that relies on a single inventory error

statistic (namely, the probability of a positive, or stock-reducing, error) and substantially reduces

the estimation bias.

Our work complements existing research by arguing for a better understanding of record

inaccuracy and more effort toward its mitigation. Much of the academic research

on record inaccuracy and RFID has focused on their implications for replenishment policies. Our

work is relevant to a broader set of firms—for example, fashion retailers who engage in limited

replenishment but for whom learning about demand trends is important.

The paper is organized as follows. We review related literature in Section 2. Section 3 introduces

our notation and model setting and reviews some background on demand learning from censored

observations. We present in Section 4 a basic result suggesting that, under fairly general assumptions,

record inaccuracy biases demand estimates downward. In Section 5, we provide a detailed

analysis for the case of exponential demand, where we make use of concise demand updating

equations under censored demand established by Braden and Freimer (1991). Specifically, Section 5.1


characterizes fixed points of the firm’s estimation and stocking process given this misspecification.

Numerical results in Section 5.2 lead naturally to a discussion of prescriptions in Section 6. Section

7 presents a numerical framework (including the Metropolis algorithm to perform the Bayes update

of demand beliefs) for studying the estimation bias under discrete demand distributions. Using this

framework, we corroborate many of our earlier findings for the case of Poisson demand. We discuss

future research opportunities in Section 8.

2. Related Literature

Our research lies at the intersection of two significant but distinct research streams. The first

concerns estimating demand and optimizing inventory policies based on censored observations.

Wecker (1978), Nahmias (1994), and Agrawal and Smith (1996) present methods for estimating

demand parameters from sales observations when stocking levels are known. Especially relevant

to our work are parametric Bayesian models for learning demand from censored observations,

which date to Harpaz et al. (1982). Lariviere and Porteus (1999), making use of the “newsvendor

distribution” framework of Braden and Freimer (1991) and assuming that inventory is perishable,

show that an optimal forward-looking seller in a censored-demand setting will stock more than a

myopic seller. Ding et al. (2002) extend this result to a more general set of continuous distributions

(see also notes by Lu et al. 2005, 2008 and Bensoussan et al. 2009b). Our work makes use of the

framework assumed in these papers in that we assume a repeated newsvendor setting with demand

censoring and a parametric Bayesian model for learning demand. We review this model in Section

3.1 and the newsvendor distribution framework in Section 3.2. For the special case of exponential

demand that we assume in Section 5, Bisi et al. (2011) and Bensoussan et al. (2009a) offer detailed

results on policies and estimation consistency.

The results of Lariviere and Porteus (1999) and Ding et al. (2002) have been extended by

several authors. Chen and Plambeck (2008), Lu et al. (2007), Chen (2010), and Bisi et al. (2011)

consider perishable inventory. A number of works, including Burnetas and Smith (2000), Huh

and Rusmevichientong (2009), and Huh et al. (2011), consider learning of censored demand in

non-parametric settings. Besbes and Muharremoglu (2012) establish the theoretical importance of

censoring information through an asymptotic analysis, particularly when demand takes discrete

values. Nevertheless, in all the works we are aware of on learning from censored demand, a key

implicit assumption is that inventory positions are known so that stockouts are accurately observed.

The second research stream most relevant to our work is the literature on inventory record

inaccuracy and inventory misplacement.¹ Work on inventory management with record inaccuracy

dates to Iglehart and Morey (1972). Recent interest in the problem follows empirical studies of

record inaccuracy and inventory misplacement (e.g., DeHoratius and Raman 2008, Ton and Raman

2006) and accompanies the emergence of new tracking technologies such as RFID (Lee and Ozer

2007). Several analytical studies have looked at the implications of inventory inaccuracies in supply

chain settings (e.g., Heese 2007, Gaukler et al. 2007, Camdereli and Swaminathan 2010, Kok and

Shang 2010), but much of the work in this area concerns inventory and audit policies for single-item,

single-location systems with inventory inaccuracy (Kang and Gershwin 2005, Kok and Shang 2007,

Bensoussan et al. 2007, DeHoratius et al. 2008, Rekik et al. 2008, Atali et al. 2011, Mersereau 2011).

These papers differ from each other largely in the details underlying their inventory, error, and

information models. Many of these papers estimate the value of inventory visibility by comparing a

naive or ignorant seller (who is oblivious to inventory inaccuracies), an informed seller (who knows

the distribution but not the realizations of inventory inaccuracies) and an RFID-enabled seller (who

has full visibility into customer-available stock). The papers then measure the value of visibility as

the improvement in decision-making achieved through better information. For example, the value

of RFID is typically claimed to be the difference in performance between the RFID-enabled seller

and the naive or informed sellers.

These papers differ from our work most significantly in that they all assume that demand

probability distributions (and typically also error distributions) are known. In our model, the demand

distribution is unknown and learned from (censored) observations. We model a naive seller who is

oblivious to the error process. We compare this naive seller with a “tracking” (or RFID-enabled)

seller who observes error realizations, and we propose a heuristic for an informed seller who is

aware of errors and has limited knowledge of their distribution. Our focus is the impact of inventory

record inaccuracy on the seller’s ability to estimate demand, in contrast to previous works that all

measure the impact of inventory record inaccuracy on matching supply with demand given known

demand processes. DeHoratius and Ton (2009) survey the literature on inventory record

inaccuracy and identify an “opportunity to incorporate execution problems into models that estimate

demand and assess forecast accuracy in the presence of stockouts” (p. 67). We believe we are the

first to investigate this interaction between inventory inaccuracies and demand estimation. Our

work shows that previous assessments of RFID in the literature may underestimate its value by

assuming known demand distributions.

¹ DeHoratius and Ton (2009) distinguish between “inventory record inaccuracy” (where the quantity of inventory in a firm’s facility does not match the firm’s inventory record) and “misplaced products” (where some fraction of the on-hand inventory is in the facility but unavailable to customers). Our interest is in any situation in which a firm’s inventory records do not reflect customer-available inventory. We model an error process that may be a combination of both types of errors, and we refer to it as “inventory record inaccuracy” for simplicity.

Much of our paper analyzes a naive seller who estimates demand and makes stocking decisions

while oblivious to the existence of inventory inaccuracies. Thus the naive seller makes an incorrect

assumption about the process generating her observations. A small literature has emerged on

misspecified demand models in operations management (Cachon and Kok 2007, Cooper et al. 2006,

2009). Lee et al. (2009) investigate convergence in a general repeated newsvendor setting in which

the seller stocks according to a critical fractile-based policy and demand subsequently depends on

the stocked quantity. This bears some resemblance to our problem, although the misspecification

in their model involves an endogenous demand-generating process rather than uncertainty around

stocking levels.

3. Preliminaries

In this section we build our model by first reviewing the classical Bayesian censored newsvendor

model, including the “newsvendor distribution” framework, to which we then add a model of

inventory record errors.

3.1. Review of Demand Estimation under Censoring

Here we review the censored newsvendor problem and introduce some notation without the

complication of inventory record errors. We will add errors to the model in Section 3.3. We assume

that a seller stocks a quantity jt of units in period t to satisfy random demand. We assume that

the demand distribution is unknown. In particular, we assume that demand dt in period t is drawn

from a stationary probability distribution F (·|θ). The functional form F (·|θ) is known, and θ is a

fixed but unknown parameter. Conditional on θ, demand is independent across periods.

At the end of period t, the seller observes sales st = min{dt, jt}, and unfulfilled demand is

unobserved. Therefore, if the seller observes st < jt then she assumes that dt = st; if the seller

observes st = jt then she assumes that dt ≥ st.

The seller begins with a prior density π0 on θ that she updates in a Bayesian fashion each period

based on sales observations. For the purpose of computing a posterior distribution of θ at period

t, the seller requires not only the history of sales St = (s1, . . . , st) but also a vector Ct = (c1, . . . , ct)

of indicators recording whether there was a stockout in each period. We let ct = 1 if

st = jt and ct = 0 if st < jt.


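Given this history, we can compute the posterior density πt(·|St,Ct) of θ at time t. Assuming

that demand takes discrete, integer values, let f(·|θ) indicate the probability mass function

corresponding with distribution F (·|θ). The posterior density πt(·|St,Ct) is then:

\begin{align}
\pi_t(\theta \mid S_t, C_t) &\propto \pi_0(\theta) \prod_{\tau=1}^{t} \left[ \mathbb{1}\{c_\tau = 1\}\,\bigl(1 - F(s_\tau - 1 \mid \theta)\bigr) + \mathbb{1}\{c_\tau = 0\}\, f(s_\tau \mid \theta) \right] \notag \\
&\propto \pi_0(\theta) \prod_{\tau=1}^{t} \bigl(1 - F(s_\tau - 1 \mid \theta)\bigr)^{c_\tau} f(s_\tau \mid \theta)^{1 - c_\tau}. \tag{1}
\end{align}

If demand instead follows a continuous distribution with density f(·|θ), the posterior density is:

\begin{align}
\pi_t(\theta \mid S_t, C_t) &\propto \pi_0(\theta) \prod_{\tau=1}^{t} \left[ \mathbb{1}\{c_\tau = 1\}\,\bigl(1 - F(s_\tau \mid \theta)\bigr) + \mathbb{1}\{c_\tau = 0\}\, f(s_\tau \mid \theta) \right] \notag \\
&\propto \pi_0(\theta) \prod_{\tau=1}^{t} \bigl(1 - F(s_\tau \mid \theta)\bigr)^{c_\tau} f(s_\tau \mid \theta)^{1 - c_\tau}. \tag{2}
\end{align}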

We denote the predictive distribution (after unconditioning on θ) after period t by Ft(d|St,Ct).

3.2. Review of Newsvendor Distributions

An interesting and tractable set of special cases arises when we assume the demand distribution

falls within the family of “newsvendor distributions” identified by Braden and Freimer (1991)

and further studied by Lariviere and Porteus (1999). Here we briefly review relevant results on

newsvendor distributions.

A newsvendor distribution has the form

\[ F(d \mid \theta) = 1 - \exp\{-\theta g(d)\}, \quad d \in \mathbb{R}_+ \]

where the function g(·) is positive, differentiable, and increasing for d ≥ 0; also, lim_{d→0+} g(d) = 0 and

lim_{d→∞} g(d) = ∞. An example is the exponential distribution, which is a newsvendor distribution

with g(d) = d.

The gamma distribution is a conjugate prior for all newsvendor distributions. If we let a0 and

b0 indicate the shape and scale parameters, respectively, of a gamma prior density on θ,

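\[ \pi_0(\theta \mid a_0, b_0) = \frac{b_0^{a_0}\, \theta^{a_0 - 1} e^{-b_0 \theta}}{\Gamma(a_0)}, \]

then the posterior distribution of θ given observations St and Ct is gamma with shape and scale

parameters

\begin{align}
a_t &= a_0 + \sum_{\tau=1}^{t} (1 - c_\tau), \tag{3} \\
b_t &= b_0 + \sum_{\tau=1}^{t} g(s_\tau). \tag{4}
\end{align}

This gamma posterior has mean a_t/b_t and variance a_t/b_t^2, and the seller’s predictive density of

demand in period t + 1 is

\[ f(d_{t+1} \mid a_t, b_t) = \frac{a_t\, b_t^{a_t}\, g'(d_{t+1})}{[b_t + g(d_{t+1})]^{a_t + 1}}. \tag{5} \]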

The parameters at and bt are sufficient statistics for the demand parameter θ in the Bayes

sense. This is convenient for computation. Moreover, this implies that, in the case in which a

newsvendor demand distribution is assumed, the matching of ct with its corresponding st becomes

unimportant. That is, the seller need only keep track of the total number of censored observations

and an aggregate measure of sales without regard to which sales observations were censored and

which were not.
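The updating rules (3) and (4) and the predictive density (5) can be sketched in a few lines of Python. This is only an illustration: the function names, the sample sales history, and the prior values a0 = 2, b0 = 30 are assumptions made for the example, and g defaults to g(d) = d (the exponential case).

```python
def gamma_update(a0, b0, sales, censored, g=lambda s: s):
    """Equations (3) and (4): a counts uncensored periods, b accumulates g(sales)."""
    a = a0 + sum(1 - c for c in censored)
    b = b0 + sum(g(s) for s in sales)
    return a, b


def predictive_density(d, a, b, g=lambda x: x, g_prime=lambda x: 1.0):
    """Equation (5): predictive density of next-period demand given (a, b)."""
    return a * b**a * g_prime(d) / (b + g(d)) ** (a + 1)


# Hypothetical four-period history: two exact observations, two stockouts.
sales = [12.0, 20.0, 7.5, 20.0]
censored = [0, 1, 0, 1]

a, b = gamma_update(2.0, 30.0, sales, censored)
posterior_mean = a / b

# Reassigning which periods were censored leaves (a, b) unchanged,
# as long as the number of censored periods is the same.
a_alt, b_alt = gamma_update(2.0, 30.0, sales, [1, 0, 1, 0])
```

Only the count of censored periods and the sum of g(s) enter the posterior, which is why the two calls above return the same pair.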

Braden and Freimer (1991) argue that the class of newsvendor distributions is a versatile subset

of continuous distributions. However, there appear not to be analogous results for discrete

distributions commonly used to model demand, like the Poisson distribution. Section 7 considers the

case of Poisson demand.

3.3. Model of Inventory Record Inaccuracy

We model inventory record inaccuracy by distinguishing between the quantity jt the seller orders

and the quantity it that is in fact customer-available. A discrepancy between it and jt may arise from

a variety of sources: merchandise theft or damage, recording or counting errors, or misplacement.

We assume that it = {jt− εt}+, where εt is a random variable drawn from an error distribution H.

While in some practical contexts errors may accumulate over time until the firm performs an

audit, resulting in time-dependent errors, we assume for simplicity that the error process is

stationary and that errors are independent across periods. This is a reasonable model for a repeated

newsvendor of perishable goods whose stock is impacted by errors each period before customers

arrive. For expositional simplicity, we also assume that H is non-degenerate such that Pr{εt = 0} < 1.

Except where specified otherwise, we allow H to be continuous, discrete, or a mixture of continuous

and discrete distributions. We also define the distribution Kjt of “realized” errors rt = jt − it = min{εt, jt}.

It follows that Kjt(r) = H(r) for r < jt, and Kjt(jt) = 1 (because rt = min{εt, jt} ≤ jt).

We have Kjt =H when H is such that Pr{εt > jt}= 0.

For the sake of comparison, we consider two sellers operating with different information sets:

1. A “naive” seller knows jt and thinks that jt units are available to the customer in period t.

That is, the seller ignores errors or is oblivious to them. We believe that this is representative of

common practice. We assume that the naive seller maintains a database of sales S^n_t = (s^n_1, . . . , s^n_t)

and of stockout indicators C^n_t = (c^n_1, . . . , c^n_t), where c^n_t = 1 if s^n_t ≥ jt and c^n_t = 0 if s^n_t < jt, and that

she computes posterior densities π^n_t(·|S^n_t, C^n_t) for θ using equation (1) or (2).

Case   Realization        st    c^r_t   c^n_t
(A)    it ≥ jt > dt       dt    0       0
(B)    jt > it > dt       dt    0       0
(C)    it > dt ≥ jt       dt    0       1
(D)    dt ≥ it ≥ jt       it    1       1
(E)    jt > dt ≥ it       it    1       0
(F)    dt ≥ jt > it       it    1       0

Table 1. Sales and stockout observations recorded in period t by tracking and naive sellers given order quantity jt under all possible demand and error realizations.

We permit εt < 0, in which case it is possible for the naive seller to observe sales exceeding her

order quantity—that is, snt > jt. This is a zero-probability event given her belief that jt is the true

customer-available inventory level. We assume that the seller interprets such an observation as a

stockout with sales st.

2. A “tracking” seller knows the true customer-available inventory level it, which is plausible

under a perfect implementation of a tracking technology such as RFID. We assume that the tracking

seller maintains a database of sales S^r_t = (s^r_1, . . . , s^r_t) and of stockout indicators C^r_t = (c^r_1, . . . , c^r_t),

where c^r_t = 1 if s^r_t = it and c^r_t = 0 if s^r_t < it, and that she computes posterior densities π^r_t(·|S^r_t, C^r_t)

for θ using equation (1) or (2).

We will explore the impact of inaccurate inventory records by comparing the performances of

the naive and tracking sellers. Table 1 shows the observations recorded by the two sellers, broken

down into six cases. In cases (A), (B), and (D), the naive seller registers the same information as

the tracking seller. Case (C) represents the situation in which the naive seller perceives a stockout

even though there is no physical stockout. We term this a “false stockout.” Cases (E) and (F)

represent situations in which there is a physical stockout that goes unseen by the naive seller. We

term this situation a “missed stockout.”
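The six cases can be reproduced with a short Python sketch of what each seller records in one period. The function name `observe` and the numeric inputs are illustrative assumptions; the logic follows the definitions it = {jt − εt}+, st = min{dt, it}, and the two sellers' stockout rules stated above.

```python
def observe(d, j, eps):
    """Return (sales, tracking stockout flag, naive stockout flag) for one period."""
    i = max(j - eps, 0)        # customer-available stock
    s = min(d, i)              # sales are censored by the true stock
    c_r = 1 if s == i else 0   # tracking seller compares sales with i
    c_n = 1 if s >= j else 0   # naive seller compares sales with j
    return s, c_r, c_n


# One hypothetical realization per case, all with order quantity j = 10.
cases = {
    "A": observe(d=5, j=10, eps=-2),   # i >= j > d
    "B": observe(d=5, j=10, eps=3),    # j > i > d
    "C": observe(d=11, j=10, eps=-4),  # i > d >= j: false stockout
    "D": observe(d=15, j=10, eps=-2),  # d >= i >= j
    "E": observe(d=8, j=10, eps=3),    # j > d >= i: missed stockout
    "F": observe(d=12, j=10, eps=3),   # d >= j > i: missed stockout
}
```

In case (C) the naive flag is 1 while the tracking flag is 0, and in cases (E) and (F) the reverse holds, matching the false and missed stockouts described above.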

4. Comparison when Stocking Quantities are Common

Our first result is that, if the naive and tracking sellers stock the same quantity in a period,

the naive seller perceives too few stockouts in the period compared with the tracking seller. This

initial result requires no assumptions on the demand distribution and holds under a plausible set

of conditions on the error distribution H. We subsequently will establish the implications of this

result in the more restrictive setting in which demand follows a newsvendor distribution.


Proposition 1. Suppose the error distribution H is such that Pr{ε > 0} ≥ Pr{ε < 0} and suppose

that the naive and tracking sellers stock the same quantity j1 in period 1. Then Pr{c^n_1 = 1} − Pr{c^r_1 = 1} ≤ 0.

Proof. Observe from Table 1 that

\begin{align*}
\Pr\{c^n_1 = 1\} - \Pr\{c^r_1 = 1\} &= \Pr\{i_1 > d_1 \ge j_1\} + \Pr\{d_1 \ge i_1 \ge j_1\} - \Pr\{d_1 \ge i_1\} \\
&= \Pr\{(d_1 \ge j_1) \cap (i_1 \ge j_1)\} - \Pr\{d_1 \ge i_1\} \\
&= \Pr\{d_1 \ge j_1\} \Pr\{\varepsilon_1 \le 0\} - \Pr\{d_1 \ge i_1\} \\
&\le \Pr\{d_1 \ge j_1\} \Pr\{\varepsilon_1 \ge 0\} - \Pr\{d_1 \ge i_1\} \\
&= \Pr\{d_1 \ge j_1 \ge i_1\} - \Pr\{d_1 \ge i_1\} \\
&\le \Pr\{d_1 \ge i_1\} - \Pr\{d_1 \ge i_1\} \\
&= 0,
\end{align*}

where the third equality relies on the independence of d1 and i1 given j1, and the first inequality

makes use of the assumption Pr{ε > 0} ≥ Pr{ε < 0}.

The condition Pr{ε > 0} ≥Pr{ε < 0} of Proposition 1 is consistent with the audit results observed

by DeHoratius and Raman (2008), who found positive discrepancies in 59% of inaccurate records

identified in a retail chain’s inventory audit. Furthermore, this condition covers the error

assumptions made in much of the analytical literature on inventory record inaccuracy. It subsumes

nonnegative errors (i.e., Pr{ε ≥ 0} = 1), which are reasonable to assume for certain error sources, for

example, theft, damage, or misplacement. This assumption is made in Kang and Gershwin (2005)

and Kok and Shang (2010). Proposition 1 also applies to error distributions that are symmetric

about zero, which is a plausible model of clerical errors and assumed by Kok and Shang (2007).

Finally, Proposition 1 also applies to mixtures of nonnegative and symmetric error distributions,

which is an appropriate model when multiple sources of errors are assumed (e.g., Atali et al. 2011).
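Proposition 1 can be checked exactly on a small discrete instance by enumerating all demand and error realizations. The sketch below is an illustration only: the uniform demand on 0..20, the four-point error distribution, and the order quantity j = 10 are assumed values chosen so that Pr{ε > 0} = 0.3 ≥ Pr{ε < 0} = 0.2.

```python
from itertools import product


def stockout_probabilities(j, demand_pmf, error_pmf):
    """Exact Pr{c^n = 1} and Pr{c^r = 1} for independent demand and error."""
    p_naive = 0.0
    p_track = 0.0
    for (d, p_d), (eps, p_e) in product(demand_pmf.items(), error_pmf.items()):
        i = max(j - eps, 0)                  # customer-available stock
        s = min(d, i)                        # censored sales
        p_naive += p_d * p_e * (s >= j)      # naive seller perceives a stockout
        p_track += p_d * p_e * (s == i)      # tracking seller sees a stockout
    return p_naive, p_track


demand_pmf = {d: 1 / 21 for d in range(21)}      # illustrative demand model
error_pmf = {-1: 0.2, 0: 0.5, 1: 0.2, 2: 0.1}    # illustrative error model

p_naive, p_track = stockout_probabilities(10, demand_pmf, error_pmf)
```

For this instance the naive seller perceives a stockout with probability 7.7/21 ≈ 0.367, while the tracking seller sees one with probability 11.2/21 ≈ 0.533, so the difference is negative as the proposition states.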

If the naive seller is seeing the same sales observations St and too few stockouts compared

with the tracking seller, it is intuitive that the naive seller would therefore underestimate demand

relative to the tracking seller. This is challenging to show in general because the sellers’ beliefs

generally involve the joint distributions of (st, c^r_t) and (st, c^n_t), which can be complex. However, if

we work with newsvendor distributions, the sellers’ beliefs depend only on the aggregate statistics

∑_{τ=1}^{t} sτ, ∑_{τ=1}^{t} (1 − c^r_τ), and ∑_{τ=1}^{t} (1 − c^n_τ). This enables the following result.

Proposition 2. Suppose that

a) the error distribution H is such that Pr{ε > 0} ≥Pr{ε < 0} (and Pr{ε= 0}< 1),


b) demand follows a newsvendor distribution with known g() and unknown parameter θ > 0, on

which the sellers share an arbitrary, non-degenerate prior distribution with well-defined density,

and

c) the naive and tracking sellers stock the same fixed, bounded sequence of quantities 0< jt <M

and experience the same realization of demands and errors.

Then with probability approaching one as t→∞, the tracking seller’s predictive distribution of

demand Ft(·|St, C^r_t) will first-order stochastically dominate the naive seller’s predictive distribution

of demand Ft(·|St, C^n_t).

We present the proof in Appendix A. The proof relies on the observation that, if the two sellers

stock the same quantities, they see the same sequence of sales observations. The proof has three

distinct parts. First, we show that the difference Pr{c^n_t = 1} − Pr{c^r_t = 1} is bounded away from

zero. Second, we show that this implies Pr{∑_{τ=1}^{t} (1 − c^r_τ) < ∑_{τ=1}^{t} (1 − c^n_τ)} → 1. Third, the stochastic

ordering of the predictive distributions follows from a result in Braden and Freimer (1991).

Propositions 1 and 2 rely on the assumption that the naive and tracking sellers stock the same

sequence of stocking quantities. In practice, however, we would expect the two sellers to stock

quantities based on their evolving beliefs about demand. Although a naive seller underestimates

demand relative to a tracking seller for the same stocking quantity, the naive seller will increasingly

perceive stockouts as her stocking levels decrease. In the following sections, we study the

convergence of this process and the magnitude of the naive seller’s bias in steady state. Section 5 pursues

these goals analytically for the special case of exponential demand, and we present a numerical

procedure in Section 7 for discrete distributions.

5. Exponential Demand

The previous section compared the observations made by the naive and tracking sellers given a

common sequence of order quantities. In this section, we assume that the naive seller chooses an

order quantity jt in period t that depends on her own demand estimates. Therefore she engages

in a process of estimate-order-estimate. We seek to characterize fixed points of such a process.

Assuming demand follows an exponential distribution F (d|θ) = 1− exp{−θd} for some θ > 0, we

express a fixed point of the estimate-order-estimate process as the solution to a single equation in

Section 5.1, which we use to gain insights in Section 5.2.

5.1. Characterizing Fixed Points

Here we assume a repeated newsvendor setting in which inventory perishes between periods. We

seek fixed points of the naive seller’s stocking and estimation process. That is, we seek pairs (θ̂, j)

such that (1) if the seller’s demand estimate is fixed at θ̂ > 0 then the seller will stock j > 0, and

(2) if the seller stocks j repeatedly then the seller’s demand parameter estimate converges to θ̂.

We will numerically demonstrate in Section 5.2 that the fixed points we characterize here describe

steady state behavior of the naive seller’s estimate-order-estimate process.

We assume that demand follows an exponential distribution (which is a newsvendor distribution),

and we assume that the seller has a gamma(a0, b0) prior on θ. This is to simplify our analysis,

although the Bernstein-von Mises theorem (see Le Cam 1986) suggests that the choice of prior is

irrelevant as the number of observations becomes large and thus has no bearing on our results. We

also assume that, given an estimate θ̂ of the demand parameter, the naive seller employs a critical

fractile policy, choosing j to achieve a particular pre-defined service level γ ∈ (0,1) with respect to

the assumed exponential demand distribution F(·|θ̂), i.e.,

\[ j = -\frac{1}{\hat{\theta}} \ln(1 - \gamma), \]

which follows directly from setting γ equal to the cdf of the exponential distribution with rate θ̂.

This is an optimal stocking quantity for a newsvendor with known θ = θ̂ and no inventory errors;

therefore it specifies reasonable limiting behavior of a policy that is learning θ while oblivious to

errors.

Let the random variables i, s and cn represent customer-available inventory, sales, and naive

seller stockout indicator, respectively, if the seller stocks j. Given stocking quantity j, the naive

seller perceives a stockout with probability

E[cn] = Pr{i≥ j}Pr{d≥ j}= Pr{r≤ 0} exp{−θj}= Pr{ε≤ 0} exp{−θj} (6)

and sees average sales

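\[ E[s] = \int_{-\infty}^{\infty} E_d[\min\{d, j - r\}] \, dK_j(r). \]

Making use of the memoryless property of the exponential distribution,

\[ E_d[\min\{d, j - r\}] = \frac{1}{\theta} \bigl(1 - \exp\{-\theta (j - r)\}\bigr), \]

which yields

\begin{align}
E[s] &= \frac{1}{\theta} - \frac{1}{\theta} \int_{-\infty}^{\infty} \exp\{-\theta (j - r)\} \, dK_j(r) \notag \\
&= \frac{1}{\theta} - \frac{1}{\theta} \exp\{-\theta j\} M_r(\theta, j), \tag{7}
\end{align}

where M_r(θ, j) ≡ E_r[e^{θr} | j] is the moment-generating function of the random variable r evaluated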

at θ. We assume this function exists at θ for all j > 0.


Suppose the naive seller stocks j each period and that E[j − r]> 0 so that exp{−θ(j − r)}< 1,

then the expected sales E[s] each period is constant and positive and bt/t converges almost surely to

this expectation by the strong law of large numbers. This enables us to show that the naive seller’s

posterior belief converges to the value E[1 − cn]/E[s]. In particular, letting θt denote a random

variable distributed according to the seller’s posterior belief at time t,

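\begin{align*}
E[\theta_t] &\to \lim_{t\to\infty} \frac{(1/t)\, a_t}{(1/t)\, b_t}
= \frac{\lim_{t\to\infty} (1/t)\, a_t}{\lim_{t\to\infty} (1/t)\, b_t}
= \frac{\lim_{t\to\infty} (1/t) \sum_{\tau=1}^{t} (1 - c^n_\tau)}{\lim_{t\to\infty} (1/t) \sum_{\tau=1}^{t} s_\tau}
= \frac{E[1 - c^n]}{E[s]}, \\
\mathrm{Var}[\theta_t] &\to \lim_{t\to\infty} \frac{(1/t^2)\, a_t}{(1/t^2)\, b_t^2}
= \frac{E[1 - c^n]}{E[s]} \lim_{t\to\infty} \frac{1/t}{(1/t)\, b_t} = 0.
\end{align*}

Hence, if the naive seller repeatedly stocks j > 0, her demand parameter estimate converges to

θ̂ = E[1 − c^n]/E[s].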
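A small simulation can illustrate this convergence. The sketch below is not the paper's numerical procedure; it assumes exponential demand with rate θ = 0.05, a fixed stocking quantity j = 20, zero-mean normal errors with standard deviation 1, and a gamma(1, 1) prior, all chosen for illustration. The closed-form limit uses expressions (6) and (7), with the moment-generating function of r approximated by that of a normal error, which is reasonable here because Pr{ε > j} is negligible.

```python
import math
import random


def simulate_naive_estimate(theta, j, sigma, periods, seed=0):
    """Posterior mean a_t/b_t of a naive seller who stocks j every period."""
    rng = random.Random(seed)
    a, b = 1.0, 1.0                      # illustrative gamma prior parameters
    for _ in range(periods):
        d = rng.expovariate(theta)       # exponential demand with rate theta
        eps = rng.gauss(0.0, sigma)      # inventory record error
        i = max(j - eps, 0.0)            # customer-available stock
        s = min(d, i)                    # censored sales
        a += 0 if s >= j else 1          # equation (3) with the naive flag
        b += s                           # equation (4) with g(s) = s
    return a / b


def limiting_estimate(theta, j, sigma):
    """E[1 - c^n] / E[s] from (6) and (7) for zero-mean normal errors."""
    x = math.exp(-theta * j)
    m = math.exp(0.5 * (theta * sigma) ** 2)   # normal MGF evaluated at theta
    e_c = 0.5 * x                              # Pr{eps <= 0} = 0.5
    e_s = (1.0 - m * x) / theta
    return (1.0 - e_c) / e_s


est = simulate_naive_estimate(theta=0.05, j=20.0, sigma=1.0, periods=200_000)
lim = limiting_estimate(theta=0.05, j=20.0, sigma=1.0)
```

Under these assumptions the limit exceeds the true rate 0.05, so the implied mean demand is below the true mean of 20, and the simulated posterior mean should settle close to that limit.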

Substituting the expressions (6) and (7) into θ̂ = E[1 − c^n]/E[s], we find that a fixed point (θ̂, j)

must satisfy

\[ \frac{\hat{\theta}}{\theta} = \frac{1 - \Pr\{\varepsilon \le 0\} \exp\{-\theta j\}}{1 - M_r(\theta, j) \exp\{-\theta j\}}, \quad \text{where } j = -\frac{1}{\hat{\theta}} \ln(1 - \gamma). \tag{8} \]

A solution to equation (8) will have θ̂ ≥ θ if M_r(θ, j) ≥ Pr{ε ≤ 0}. The following proposition gives

an interpretable sufficient condition for this to be true.

Proposition 3. If equation (8) has a finite solution (θ̂, j) such that E[r|j] ≥ 0, then θ̂ ≥ θ.

Proof. Let ξ ≡ E[r|j]. Then

\[ M_r(\theta, j) = E\bigl[e^{\theta r} \mid j\bigr] = e^{\theta \xi} E\bigl[e^{\theta (r - \xi)} \mid j\bigr] \ge E\bigl[e^{\theta (r - \xi)} \mid j\bigr] \ge E\bigl[e^{0}\bigr] = 1, \]

where the first inequality follows from the assumption E[r|j] = ξ ≥ 0 and the fact that e^{θx} is an

increasing function of x when θ > 0, and the second inequality follows from Jensen’s inequality.

Because Pr{ε ≤ 0} ≤ 1, we have M_r(θ, j) ≥ Pr{ε ≤ 0} and therefore the right-hand side of equation

(8) is greater than zero. (The finiteness of θ̂ implies that 1 − M_r(θ, j) exp{−θj} ≠ 0.) Therefore θ̂ ≥ θ.

Recall that the parameter (rate) of an exponential distribution is the reciprocal of its mean.

Therefore, θ̂ ≥ θ implies that the seller underestimates demand.

Equation (8) is challenging to analyze further (and even to solve numerically) because M_r(θ, j)

depends on the error distribution and θ̂ (through j) in a complex way. This motivates our study

of an important special case of equation (8). In many settings we expect that errors will be small

relative to the quantity of inventory stocked. If we restrict ourselves to fixed points where Pr{ε > j} = 0

then we can replace M_r(θ, j) with M_ε(θ) ≡ E[e^{θε}], which is a constant with respect to θ̂ and

j. This yields

\[ \frac{\hat{\theta}}{\theta} = \frac{1 - \Pr\{\varepsilon \le 0\} \exp\{-\theta j\}}{1 - M_\varepsilon(\theta) \exp\{-\theta j\}}, \quad \text{where } j = -\frac{1}{\hat{\theta}} \ln(1 - \gamma). \tag{9} \]

Even when Pr{ε > j} is greater than zero but small (i.e., when γ is relatively large and errors are

small relative to demand), we have found this to be a very good approximation. Furthermore, it is

easy to see that M_r(θ, j) ≤ M_ε(θ) for all j, and a simple modification to the proof of Proposition

3 shows that if E[ε] ≥ 0 then any solution θ̂ to equation (9) satisfies θ̂ ≥ θ. The appendix provides

a more detailed analysis of equations (8) and (9), including conditions for the existence of their

solutions and a comparison of their solutions.

5.2. Numerical Evaluation

In this section, we compute solutions to the equations derived in the previous subsection. Our

purpose is to understand the magnitude and sensitivity of the estimation bias suggested by Proposition 3.

For all results reported in this subsection, we assume θ = 1/20 so that the true demand

distribution is the exponential distribution with mean 20 units. For the instances we consider, equations

(8) and (9) are essentially equivalent for our purposes: the difference between Mε(θ) and Mr(θ|j)

is smaller than 10−8 over all the solutions we report. For simplicity, we therefore solve (9). For

all instances we report, we find two solutions to (9). (Proposition 8 in the appendix provides con-

ditions for equation (9) to have two solutions, and our instances all satisfy these conditions.) In

all cases, one has θ in the vicinity of the true demand mean and one has θ much larger than the

mean demand such that j is very small. In what follows, we report the former solution. We will

demonstrate later that convergence is consistently to this solution.
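As a rough illustration of how the solution of equation (9) near the true parameter might be computed, the following Python sketch applies bisection to the instance with normal(0,4) errors and γ = 0.85. It assumes equation (9) reads θ̂/θ = (1 − Pr{ε ≤ 0} e^{−θj}) / (1 − Mε(θ) e^{−θj}) with j = −ln(1 − γ)/θ̂. The function name `solve_eq9` and the search bracket [θ, 2θ] are illustrative assumptions, not the procedure used to produce the reported results.

```python
import math

def solve_eq9(theta, gamma, p_nonpos, m_eps, tol=1e-12):
    """Bisection for the solution of equation (9) closest to the true rate.

    theta    : true exponential demand rate
    gamma    : critical fractile
    p_nonpos : Pr{eps <= 0}
    m_eps    : M_eps(theta) = E[exp(theta * eps)]
    """
    def gap(theta_hat):
        j = -math.log(1.0 - gamma) / theta_hat      # stocking quantity
        q = math.exp(-theta * j)                    # Pr{demand >= j}
        rhs = (1.0 - p_nonpos * q) / (1.0 - m_eps * q)
        return theta_hat / theta - rhs

    lo, hi = theta, 2.0 * theta                     # assumed bracket
    if not (gap(lo) < 0.0 < gap(hi)):
        raise ValueError("bracket does not isolate a solution")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

theta = 1.0 / 20.0
m_eps = math.exp(0.5 * theta ** 2 * 4.0)   # MGF of normal(0, 4) at theta
theta_hat = solve_eq9(theta, gamma=0.85, p_nonpos=0.5, m_eps=m_eps)
naive_mean = 1.0 / theta_hat               # naive estimate of mean demand
```

Under these assumptions, `naive_mean` can be compared with the 1/θ̂ entry for error distribution (2) at γ = 0.85 in Table 2.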

Table 2 reports numerical solutions to (9) for several critical fractile parameters γ when errors

follow a variety of distributions: (1) normal with mean 0 and variance 1 (Mε(θ) = 1.00125, Pr{ε≤

0}= 0.5), (2) normal with mean 0 and variance 4 (Mε(θ) = 1.00501, Pr{ε≤ 0}= 0.5), (3) a 75/25

mixture of a pulse at zero and normal with mean 0 and variance 4 (Mε(θ) = 1.00125, Pr{ε≤ 0}=

0.875), and (4) a 50/50 mixture of a pulse at zero and exponential with mean 1 (Mε(θ) = 1.02632,

Pr{ε≤ 0}= 0.5). The table provides the following for each instance solved.

• γ: the critical fractile used in the stocking policy. Because the critical fractile policy does not account for errors and because errors interfere with service, customer perceptions of service will typically fall below γ.

• “Adjusted γ”: the (Type 1) service level obtained by the critical fractile policy if θ were known. The difference between this value and γ represents the impact of errors on service.

• “Actual SL”: the actual (Type 1) service level as seen by customers when j is stocked. This differs from “Adjusted γ” because the seller’s demand estimate may deviate from the true demand parameter.

• 1 − E[c^n]: the service level implied by the stockout indicators observed by the naive seller.

• 1/θ̂: the naive seller’s estimate of mean demand at the fixed point.

Error distribution                          γ      Adj. γ   Actual SL   1−E[c^n]   1/θ̂
(1) errors ∼ normal(0,1)                    0.70   0.680    0.571       0.795      14.84
                                            0.80   0.780    0.728       0.874      17.10
                                            0.85   0.831    0.799       0.909      18.00
                                            0.90   0.881    0.865       0.942      18.78
                                            0.95   0.931    0.926       0.973      19.44
                                            0.99   0.971    0.970       0.995      19.90
(2) errors ∼ normal(0,4)                    0.70   0.661    0.550       0.795      14.78
                                            0.80   0.761    0.708       0.873      17.07
                                            0.85   0.812    0.780       0.909      17.98
                                            0.90   0.862    0.847       0.942      18.76
                                            0.95   0.912    0.908       0.973      19.44
                                            0.99   0.952    0.952       0.995      19.90
(3) errors ∼ δ0 w.p. 3/4,                   0.70   0.690    0.669       0.719      18.87
    normal(0,4) w.p. 1/4                    0.80   0.790    0.780       0.816      19.35
                                            0.85   0.840    0.834       0.863      19.54
                                            0.90   0.890    0.887       0.910      19.71
                                            0.95   0.941    0.940       0.955      19.87
                                            0.99   0.981    0.981       0.991      19.97
(4) errors ∼ δ0 w.p. 1/2,                   0.70   0.689    0.580       0.895      14.40
    exponential(1) w.p. 1/2                 0.80   0.795    0.743       0.936      16.90
                                            0.85   0.846    0.816       0.954      17.87
                                            0.90   0.897    0.884       0.971      18.70
                                            0.95   0.949    0.945       0.986      19.41
                                            0.99   0.990    0.989       0.997      19.89

Table 2 Solutions to equation (9).

Our results conform to the finding of Proposition 3 in that the naive seller consistently underestimates demand. That is, we see 1/θ̂ < 20 for every instance. We see that the magnitude of the bias

depends on the critical fractile parameter γ: the estimation error is small when γ is high and more

significant for small γ. This may be reassuring to retailers who target high service levels, although

due to several observable factors (e.g., upstream supply chain performance), in practice a retailer’s

stocking actions may fall short of its targets even before inventory errors are taken into account

(e.g., Gruen et al. 2002).

The magnitude of the bias also depends on the error distribution. The error variances for dis-

tributions (1) and (3) are the same, but we see that they induce very different biases. On the

other hand, (4) has the smallest variance but induces the largest biases. Inspection of equation (9)

reveals that Pr{ε≤ 0} is an important statistic for determining bias, and these results support that

observation. The error distributions (1), (2), and (4) share the same Pr{ε≤ 0}= 0.5 and induce

similar biases to each other, while (3) has Pr{ε≤ 0}= 0.875 and induces smaller biases. We make

use of the statistic Pr{ε≤ 0} in a heuristic prescription described in Section 6.3.

We observe in all of these instances not only that the service level seen by customers (“Actual

SL”) is lower than the target service level (“Adjusted γ”), but also that the naive seller’s estimate

of service level 1−E[cn] exceeds the target service level. These discrepancies are again largest for

relatively low γ and relatively low Pr{ε ≤ 0}. We conclude that the impact of inventory record

inaccuracy on demand estimation can hide, in that the naive seller may believe that it is exceeding

its service level targets even as the customers are observing service levels below the targets.

We conclude this section by addressing the question of convergence to the fixed points identified

by equation (9). Figure 1 summarizes results of simulations of the naive seller’s estimate-order-

estimate process, assuming the seller employs a Bayesian myopic policy in each period t that targets

a (Type 1) service level of γ = 0.85 in her predictive distribution f(·|a_t, b_t). In these simulations, the true θ = 1/20 and errors are distributed as ε ∼ normal(0,4). We start the simulation from three different gamma priors: one with mean 1/20, one that overestimates 1/20, and one that mainly underestimates 1/20. We simulate a repeated newsvendor for 1000 sample paths starting from each prior. The histogram in Figure 1 represents the reciprocal of the posterior mean at the end of 3000 periods for each sample path. We observe that the posterior means are clustered around the fixed point reported in Table 2 (1/θ̂ = 17.98 for this instance), indicated by the vertical line in

the figure. We have seen similar behavior for other instances and other gamma priors, leading us

to conclude that the fixed points reported in Table 2 are representative of steady-state behavior.
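The simulation described above can be sketched in a few lines of Python. This is a minimal illustration under several assumptions that are not spelled out in this section: the naive update is taken to be a_t = a_0 + Σ(1 − c^n_τ) and b_t = b_0 + Σ s_τ (our reading of equations (3) and (4)), customer-available inventory is taken to be i = max(j − ε, 0), and the myopic stocking quantity is the γ-quantile of the gamma-exponential predictive distribution, for which Pr{D > x} = (b/(b + x))^a. The function name and the seed are hypothetical.

```python
import random

def simulate_naive_seller(a0, b0, theta=1 / 20, gamma=0.85, sigma=2.0,
                          periods=3000, seed=0):
    """One sample path of the naive seller's estimate-order-estimate process."""
    rng = random.Random(seed)
    a, b = a0, b0
    for _ in range(periods):
        # gamma-quantile of the predictive distribution Pr{D > x} = (b/(b+x))**a
        j = b * ((1.0 - gamma) ** (-1.0 / a) - 1.0)
        demand = rng.expovariate(theta)          # exponential with rate theta
        error = rng.gauss(0.0, sigma)            # normal(0, 4) has std. dev. 2
        available = max(j - error, 0.0)          # assumed: i = max(j - eps, 0)
        sales = min(demand, available)
        stockout = 1 if sales >= j else 0        # naive indicator c^n
        a += 1 - stockout                        # assumed form of equation (3)
        b += sales                               # assumed form of equation (4)
    return b / a                                 # reciprocal of posterior mean

# The three priors of Figure 1 share a0 = 7.5 and differ in b0.
estimates = {b0: simulate_naive_seller(7.5, b0) for b0 in (37.5, 150.0, 300.0)}
```

Each entry of `estimates` is a single sample path, so it is subject to sampling noise; the histogram in Figure 1 aggregates 1000 such paths per prior.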

6. Prescriptions

Up until now, our focus has been on describing the estimation process of a seller operating under

a misspecification of the sales-generating process. Here, we briefly consider three prescriptions for

the problem, each requiring different sets of information about errors.

6.1. Full Error Information

Clearly, a tracking technology such as RFID has the potential to provide a seller full visibility into customer-available inventory positions and hence to foster demand estimation on par with our tracking seller. In this way our work identifies an important component of the value of such tracking technologies. As highlighted in Sections 1 and 2, previous research has revealed the impact of inventory record inaccuracy on the matching of supply and demand, although these studies have all assumed that demand distributions are known. Our work shows that a tracking technology has the potential not only to reduce supply-demand mismatch costs but also to yield improved inventory management through better demand estimation.

[Figure 1: histogram of the reciprocal of the posterior mean after 3000 periods (horizontal axis, 15 to 20) against frequency (vertical axis, 0 to 400), with one series for each prior: a0 = 7.5, b0 = 37.5; a0 = 7.5, b0 = 150; a0 = 7.5, b0 = 300.]

Figure 1 Posterior means after 3000 time periods for θ = 1/20, γ = 0.85, ε ∼ normal(0,4), and three different priors on demand.

6.2. Distributional Error Information

If the seller does not have full information on customer-available inventory levels but knows the

probability distribution H of inventory errors, a conceptually simple but computationally complex

approach is for the seller to incorporate the error uncertainty into its updates of its beliefs around

the demand parameter θ. For simplicity, we develop this for the case in which both F (·|θ) and

H are continuous distributions. In contrast to the case assuming no errors (equation (2)), here it becomes necessary to condition on the history J_t ≡ (j_1, . . . , j_t) of stocking quantities rather than the sequence of indicators C^n_t = (c^n_1, . . . , c^n_t). (This implies no loss of information, because one can calculate c^n_τ = 1{s_τ ≥ j_τ} from s_τ and j_τ.) Letting π^H_t(θ|S_t, J_t) indicate the posterior density of θ at time t when H is known, we have

\[
\pi^H_t(\theta \mid S_t, J_t) \propto \pi_0(\theta) \prod_{\tau=1}^{t} q(s_\tau \mid \theta, j_\tau),
\]

where q(s_τ|θ, j_τ) is the density representing sales as a function of θ and j_τ. The complementary cumulative demand distribution associated with this density can be written as

\[
\bar{Q}(s_\tau \mid \theta, j_\tau) = \bar{K}_{j_\tau}(j_\tau - s_\tau)\, \bar{F}(s_\tau \mid \theta)
\]

with F̄(s|θ) ≡ 1 − F(s|θ) and K̄_j(r) ≡ 1 − K_j(r). The updating simplicity afforded by assuming

a newsvendor demand distribution does not carry over to this update, so an involved numerical

computation is required to compute πH(·|St, Jt). More importantly, this approach requires the

seller to have access to the full distribution H of errors even as she does not know the demand

distribution. For these reasons, we do not pursue this approach further.

6.3. Partial Distributional Error Information

It may be unrealistic to expect a seller who does not know the demand distribution to know the

full probability distribution of errors. Here we suggest a heuristic approach that makes use of a

specific statistic of the error distribution that is easier to estimate than the full error distribution.

In the case of newsvendor demand distributions, the demand “update” using this approach can

be expressed in closed form similar to the case without errors. Assuming exponential demands, we

will show that a “heuristic seller” (that is, a seller employing this approach) will underestimate

demand in the long run, but that heuristic seller’s underestimation of demand will be smaller than

that of the naive seller.

The heuristic is based on the observation from Table 1 that

\[
E[c^n_t] = \Pr\{i_t > d_t \ge j_t\} + \Pr\{d_t \ge i_t \ge j_t\} = \Pr\{d_t \ge j_t\}\,\Pr\{\varepsilon_t \le 0\}.
\]

Therefore c^n_t / Pr{ε_t ≤ 0} is an unbiased estimator for Pr{d_t ≥ j_t}, which is the expectation of the tracking seller’s stockout indicator when i_t = j_t. The heuristic is simply to base stocking decisions on a belief on θ that takes a gamma distribution with parameters (a^h_t, b_t), where

\[
a^h_t = \max\Big\{1,\; a_0 + \sum_{\tau=1}^{t}\Big(1 - \frac{c^n_\tau}{\Pr\{\varepsilon \le 0\}}\Big)\Big\}, \tag{10}
\]

and b_t is as defined in equation (4). The “max” in the expression for a^h_t is simply to preclude a^h_t < 0 for small t, and has no impact on the long-run performance of the heuristic. Beyond this, a comparison of equations (10) and (3) shows that we are simply dividing the stockout indicators by Pr{ε ≤ 0}.
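The update in equation (10) is simple enough to state directly in code. The sketch below is illustrative only; the function name `heuristic_shape` and the example inputs are hypothetical, and the statistic Pr{ε ≤ 0} is assumed to be known to the seller.

```python
def heuristic_shape(a0, stockouts, p_nonpos):
    """Shape parameter a^h_t of equation (10).

    a0        : prior shape parameter
    stockouts : naive stockout indicators c^n_1, ..., c^n_t (0 or 1)
    p_nonpos  : Pr{eps <= 0}, assumed known
    """
    total = a0 + sum(1.0 - c / p_nonpos for c in stockouts)
    return max(1.0, total)   # the "max" keeps the shape parameter at 1 or above

# With Pr{eps <= 0} = 0.5, each observed stockout contributes 1 - 2 = -1.
a_h = heuristic_shape(7.5, [0, 1, 0, 0], 0.5)    # 7.5 + 1 - 1 + 1 + 1
a_floor = heuristic_shape(1.0, [1, 1], 0.5)      # 1 - 1 - 1 is floored at 1
```

The second call shows the case in which the floor binds, which can only happen early in the horizon when few periods have been observed.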

Assuming exponentially distributed demand and following a similar analysis to the one in Section 5.1, we can characterize fixed points of the estimate-order-estimate process involving the suggested heuristic. Analogous to equation (8), we find that such fixed points solve

\[
\frac{\hat{\theta}}{\theta} = \frac{1 - \exp\{-\theta j\}}{1 - M_r(\theta, j)\,\exp\{-\theta j\}}, \quad \text{where } j = -\frac{1}{\hat{\theta}} \ln(1-\gamma). \tag{11}
\]

The following proposition establishes that at fixed points, the heuristic seller underestimates demand, but the bias is smaller in magnitude than that of the naive seller.

[Figure 2: two panels, each plotting 1/θ̂^n and 1/θ̂^h (vertical axis, 11 to 21) against γ (horizontal axis, 0.7 to 1).]

Figure 2 Solutions to equations (9) and (11) for θ = 1/20 and ε ∼ normal(0,4) (left) and ε ∼ 0.5δ0 + 0.5·exponential(1) (right).

Proposition 4. Suppose that both equations (9) and (11) have finite solutions, and let (θ̂^n, j^n) be the solution to (9) with smallest θ̂ and (θ̂^h, j^h) be the solution to (11) with smallest θ̂. If E[r|j^n] ≥ 0 then θ̂^n ≥ θ̂^h ≥ θ.

Proof. Let R^n(θ̂) and R^h(θ̂) indicate the right-hand sides of equations (8) and (11), respectively. We first show that θ̂^n ≥ θ̂^h. Because Pr{ε ≤ 0} ≤ 1,

\[
R^n(\hat{\theta}) \equiv
\frac{1 - \Pr\{\varepsilon \le 0\}\exp\big\{\tfrac{\theta}{\hat{\theta}}\ln(1-\gamma)\big\}}
     {1 - M_r\big(\theta, -(1/\hat{\theta})\ln(1-\gamma)\big)\exp\big\{\tfrac{\theta}{\hat{\theta}}\ln(1-\gamma)\big\}}
\;\ge\;
\frac{1 - \exp\big\{\tfrac{\theta}{\hat{\theta}}\ln(1-\gamma)\big\}}
     {1 - M_r\big(\theta, -(1/\hat{\theta})\ln(1-\gamma)\big)\exp\big\{\tfrac{\theta}{\hat{\theta}}\ln(1-\gamma)\big\}}
\equiv R^h(\hat{\theta})
\]

for all θ̂ with Mr(θ, −(1/θ̂) ln(1 − γ)) exp{(θ/θ̂) ln(1 − γ)} < 1. Because lim_{θ̂→0+} R^h(θ̂) = 1 > 0, we know that R^h(θ̂) > θ̂/θ for θ̂ < θ̂^h. Therefore, θ̂^n/θ = R^n(θ̂^n) ≥ R^h(θ̂^n) implies θ̂^n ≥ θ̂^h.

Second, we show θ̂^h ≥ θ. Because θ̂^n ≥ θ̂^h, we have j^n ≤ j^h, which implies Mr(θ, j^h) ≥ Mr(θ, j^n) ≥ 1, where the second inequality is shown in the proof of Proposition 3. It then directly follows from equation (11) that θ̂^h ≥ θ.

Inputs: Data (S_t, C_t), “starting” distribution p0(θ), “jumping” distribution J(θ|θ_{i−1}).
Steps:
  Draw θ_0 from p0.
  For i = 1, 2, . . .
    Sample θ* from J(·|θ_{i−1}).
    Calculate density ratio ρ_i = π_t(θ*|S_t, C_t) / π_t(θ_{i−1}|S_t, C_t).
    Set θ_i = θ* with probability min{ρ_i, 1}, and θ_i = θ_{i−1} otherwise.
  Endfor
Outputs: Samples θ_0, θ_1, θ_2, . . . .

Figure 3 Metropolis algorithm specialized to generate samples from π_t(θ|S_t, C_t).

We can explore the impact of the heuristic by numerically solving equation (11).² Figure 2 compares the solutions to equation (11) with those from equation (9) by plotting 1/θ̂^n and 1/θ̂^h for normal(0,4) errors and for a mixture of zero and exponential(1) errors. We observe that the heuristic sharply reduces the bias for all critical fractiles γ and for both error distributions, although the heuristic is relatively less effective for the mixture of exponential and zero. This is expected given that the right-hand side of equation (11) deviates from one only inasmuch as Mr(θ, j) ≠ 1, and given that Mr(θ, j) = 1.02632 for this distribution versus 1.00501 for the normal error distribution.

² Much as we did in Section 5.2, we solve a version of (11) with Mr(θ, j) replaced by Mε(θ). The difference is negligible for the instances we consider.
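A rough back-of-the-envelope evaluation gives a sense of the size of this effect. As an illustration only, replace Mr(θ, j) by Mε(θ) as in footnote 2 and evaluate the right-hand side of equation (11) at θ̂ = θ, so that exp{−θj} = 1 − γ. For γ = 0.85 this gives

\[
\frac{1-(1-\gamma)}{1 - M_\varepsilon(\theta)(1-\gamma)}
= \frac{0.85}{1 - 1.00501 \times 0.15} \approx 1.0009 \ \ (\text{normal}(0,4)),
\qquad
\frac{0.85}{1 - 1.02632 \times 0.15} \approx 1.0047 \ \ (\text{mixture}).
\]

These are values of the right-hand side at θ̂ = θ rather than fixed points, but they suggest why the residual bias of the heuristic is several times larger for the mixture distribution than for the normal distribution.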

7. Discrete Demand Distributions

A limitation of the analyses in Sections 5 and 6.3 is their assumption of exponential demand. It is

of interest to explore the extent to which our findings extend to other demand forms. For example,

retail demand is often modeled as discrete, following a Poisson or negative binomial probability

distribution. As Braden and Freimer (1991) suggest, there does not seem to be a meaningful data

reduction for these demand distributions as for the newsvendor distributions described in Section

3.2. This makes computation of the Bayes update of equation (1) considerably more complicated,

and impossible to meaningfully express in closed form. In this section we present a framework for

numerically evaluating fixed points of the estimate-order-estimate process. We demonstrate the

framework on a set of instances assuming Poisson demand.

7.1. Procedure

We approximate the posterior demand distribution given data (St,Ct) with samples generated

using the Metropolis algorithm (see, for example, Gelman et al. 2004, Chapter 11). The Metropolis

algorithm iteratively generates a sequence of samples which, after a suitable warmup period, can

be viewed as samples from the desired posterior distribution. Figure 3 lays out the steps of the

algorithm.

Several implementation decisions are required. We calculate the mean µS and standard deviation

σS of the vector St. For the starting distribution, we use a normal distribution with mean µS and

standard deviation σS. For the jumping distribution, we use a normal distribution with mean θi−1

and standard deviation 0.05σS. This yields average “acceptance rate” 1− ρ between 0.30 and 0.50

for most of the instances we tried. (There is some theoretical basis for targeting an acceptance rate

around 0.44. See Gelman et al. 2004.) We approximated πt(θ|St,Ct) by 5000 samples generated

after discarding the first 1000 samples.
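The implementation choices above can be sketched as follows for Poisson demand. This is a minimal illustration, not the code used for the reported results. It assumes a flat prior on θ > 0 (the prior is not specified in this section), takes the posterior to be proportional to the product of (1 − F(s_τ − 1|θ))^{c_τ} f(s_τ|θ)^{1−c_τ}, and redraws the starting point if it falls at or below zero. The helper names and the synthetic data, which contain no inventory errors, are hypothetical.

```python
import math
import random

def poisson_draw(rng, mean):
    """Draw a Poisson variate by Knuth's multiplication method."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def log_posterior(theta, counts):
    """Log posterior of the Poisson mean up to a constant (flat prior assumed).

    counts maps (sales, stockout) pairs to their number of occurrences.
    """
    if theta <= 0.0:
        return -math.inf
    max_s = max(s for s, _ in counts)
    total, cdf_below = 0.0, 0.0              # cdf_below holds F(k - 1 | theta)
    for k in range(max_s + 1):
        log_pmf = -theta + k * math.log(theta) - math.lgamma(k + 1)
        total += counts.get((k, 0), 0) * log_pmf          # uncensored periods
        n_censored = counts.get((k, 1), 0)                # stockout periods
        if n_censored:
            total += n_censored * math.log(max(1.0 - cdf_below, 1e-300))
        cdf_below += math.exp(log_pmf)
    return total

def metropolis(sales, stockouts, n_keep=5000, n_warmup=1000, seed=0):
    """Random-walk Metropolis sampler following the steps of Figure 3."""
    rng = random.Random(seed)
    counts = {}
    for pair in zip(sales, stockouts):
        counts[pair] = counts.get(pair, 0) + 1
    mu = sum(sales) / len(sales)
    sigma = math.sqrt(sum((s - mu) ** 2 for s in sales) / len(sales))
    theta = rng.gauss(mu, sigma)                   # starting distribution
    while theta <= 0.0:                            # redraw outside the support
        theta = rng.gauss(mu, sigma)
    log_p = log_posterior(theta, counts)
    samples = []
    for i in range(n_warmup + n_keep):
        proposal = rng.gauss(theta, 0.05 * sigma)  # jumping distribution
        log_p_new = log_posterior(proposal, counts)
        # Accept with probability min{rho, 1}, where rho is the density ratio.
        if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
            theta, log_p = proposal, log_p_new
        if i >= n_warmup:
            samples.append(theta)
    return samples

# Synthetic check: Poisson(20) demand censored at a fixed stock of 24 units.
data_rng = random.Random(1)
stock = 24
demands = [poisson_draw(data_rng, 20.0) for _ in range(500)]
sales = [min(d, stock) for d in demands]
stockouts = [int(d >= stock) for d in demands]
draws = metropolis(sales, stockouts)
posterior_mean = sum(draws) / len(draws)
```

Working with log densities avoids underflow when the product runs over thousands of periods, and grouping identical (sales, stockout) pairs keeps each posterior evaluation cheap.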

Given this Metropolis implementation, we use the following procedure to approximate fixed points of the estimate-order-estimate process. For each potential integer order quantity j (from a range of potential order quantities), we use the true demand mean θ to generate a sample path of length T = 5000 periods, yielding values for (S_T, C^n_T). Using (S_T, C^n_T) as inputs to the Metropolis algorithm, we generate N = 5000 samples {θ_i}_{i=1,...,N} with density π_T(·|S_T, C^n_T) and use these samples to estimate the posterior mean θ̂(j) = (1/N) Σ_{i=1}^{N} θ_i. If j equals the critical fractile quantity given mean demand θ̂(j), then we declare that j is (approximately) a fixed point of the estimate-order-estimate process. In cases in which we find multiple fixed points for the same set of model primitives, we report the largest of the fixed points in what follows. For large error variances and very small target critical fractiles, we were not able to find fixed points. Simulations suggest that the process may not converge to nonzero demand estimates for these instances.
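The fixed-point test in this procedure can be sketched as below. The sketch assumes that the critical fractile quantity for Poisson demand is the smallest integer j whose cumulative probability reaches γ, which is the usual discrete newsvendor convention but is not stated explicitly here; the function names are hypothetical.

```python
import math

def critical_fractile_quantity(mean, gamma):
    """Smallest integer j with Poisson CDF F(j | mean) >= gamma (0 < gamma < 1)."""
    j, pmf = 0, math.exp(-mean)
    cdf = pmf
    while cdf < gamma:
        j += 1
        pmf *= mean / j          # Poisson recursion f(j) = f(j - 1) * mean / j
        cdf += pmf
    return j

def is_fixed_point(j, posterior_mean, gamma):
    """True if stocking j reproduces itself given the estimate it induces."""
    return j == critical_fractile_quantity(posterior_mean, gamma)

j_star = critical_fractile_quantity(20.0, 0.85)
```

In the procedure described above, `posterior_mean` would be the Metropolis estimate θ̂(j) obtained from a sample path generated with order quantity j.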

7.2. Adapting the Heuristic Prescription

There is a straightforward adaptation of the heuristic prescription introduced in Section 6.3 to the general case. When F(d|θ) is discrete, we modify the update of equation (1) to

\[
\pi^h_t(\theta \mid S_t, C_t) \propto \pi_0(\theta) \prod_{\tau=1}^{t}
\big(1 - F(s_\tau - 1 \mid \theta)\big)^{c_\tau / \Pr\{\varepsilon \le 0\}}\,
f(s_\tau \mid \theta)^{1 - c_\tau / \Pr\{\varepsilon \le 0\}}. \tag{12}
\]

The exponent 1 − c_τ/Pr{ε ≤ 0} may be negative and therefore hard to interpret, but this does not cause computational issues. The density π^h_t(θ|S_t, C_t) remains a valid density, and the Metropolis algorithm can be applied to any valid density. Therefore, the procedure proposed in Section 7.1 can also be used to estimate fixed points of the heuristic seller’s process.

7.3. Illustration for Poisson Demand

We present two sets of results generated using this procedure for repeated newsvendors. In both

examples, the true demand distribution is Poisson with mean 20. The errors each period are given by

independent symmetric Skellam distributions (i.e., the distribution of the difference of two Poisson

random variables with the same means) in Figure 4(a), and by independent Poisson distributions

in Figure 4(b). The figures show both the naive and heuristic sellers’ fixed point demand means

for various error variances and target critical fractiles γ.

[Figure 4: two panels plotting the fixed-point demand estimate θ̂ (vertical axis, 15 to 21) against γ (horizontal axis, 0.6 to 1). Panel (a) shows the naive and heuristic sellers for λ+ = 1, λ− = 1 and for λ+ = 4, λ− = 4. Panel (b) shows the naive and heuristic sellers for λ+ = 0.5, λ− = 0 and for λ+ = 2, λ− = 0.]

Figure 4 Steady state demand estimates of the naive and heuristic sellers (a) when errors follow Skellam(λ+, λ−) distributions and (b) when errors follow Poisson(λ+) distributions.

We see that both naive and heuristic sellers underestimate the true demand mean θ= 20 just as

in Section 5, and the biases are largest when demand errors are most substantial and when target

service levels are smallest. We see that the bias depends on the shape of the error distribution

particularly through the statistic Pr{ε ≤ 0}. For example, the bias is much more significant for

Poisson(2) errors (variance 2, Pr{ε≤ 0}= 0.135) than for Skellam(1,1) errors (variance 2, Pr{ε≤

0}= 0.654). Finally, we see that the heuristic eliminates a substantial portion, but not all, of the

bias. In short, we find that the results under Poisson demand are aligned with those we saw under

exponential demand.
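The two values of Pr{ε ≤ 0} quoted in this comparison can be reproduced with a short calculation. The sketch below is illustrative; the function names are hypothetical, and the symmetric Skellam probability is obtained by truncating the series for Pr{ε = 0}.

```python
import math

def prob_nonpositive_poisson(lam):
    """Pr{eps <= 0} for eps ~ Poisson(lam): only eps = 0 qualifies."""
    return math.exp(-lam)

def prob_nonpositive_symmetric_skellam(lam, terms=60):
    """Pr{eps <= 0} for the difference of two independent Poisson(lam) variables."""
    # Pr{eps = 0} is the sum over k of (Poisson pmf at k) squared.
    p_zero, pmf = 0.0, math.exp(-lam)
    for k in range(terms):
        p_zero += pmf * pmf
        pmf *= lam / (k + 1)
    # By symmetry, Pr{eps < 0} = Pr{eps > 0} = (1 - Pr{eps = 0}) / 2.
    return 0.5 * (1.0 + p_zero)

p_poisson = prob_nonpositive_poisson(2.0)             # Poisson(2) errors
p_skellam = prob_nonpositive_symmetric_skellam(1.0)   # Skellam(1,1) errors
```

Both error distributions have variance 2, so the difference in bias between them is attributable to the shape of the distribution rather than to its spread.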

8. Concluding Remarks

We conclude that ignoring inventory record inaccuracy when trying to learn demand from censored observations can have a deleterious impact on a firm’s operations, leading to systematic underestimation of demand that in turn impacts the true service levels experienced by the firm’s

customers. Even as this happens, the service levels may appear to the firm to exceed targets. We

find that the firm’s demand underestimation is largest when service levels are relatively low and

when there is a significant probability of positive (i.e., stock-reducing) errors. We suggest a simple

heuristic correction that makes use of the probability of positive errors (equivalently, the probabil-

ity of non-positive errors). Our work suggests a new component of the value of inventory tracking

technologies. In addition, we complement existing literature on inventory record inaccuracy by

revealing another incentive for firms to systematically understand their inventory errors.

Our work leaves several theoretical questions unanswered. While the fixed points we have iden-

tified in our analyses seem to represent steady-state behavior in our numerical studies, we have

side-stepped a theoretical study of convergence. In addition, we leave open the study of optimal

stocking policies in the presence of demand learning and inventory errors. In this vein, a reasonable

question is whether the “stock more” results of Lariviere and Porteus (1999) and Ding et al. (2002)

carry over to the setting with inventory errors.

Finally, some work would be required to apply our ideas to practical settings. While the Bayesian

model we have followed is rigorous and convenient for analysis, it would be interesting to investigate

to what extent our results extend to non-parametric settings or to estimation paradigms used in

practice. We are aware of several retailers who account for censoring via extrapolation schemes.

For example, if in a seven-day week the firm believes that the customer-available inventory was

zero for two of the days, then the firm multiplies observed sales by 7/5 to approximate the week’s

demand. We leave it for future work to understand the impact of inventory inaccuracy and to

invent effective prescriptions in such settings.

Acknowledgments

The author thanks Vinayak Deshpande and Linus Schrage for comments on a draft version of this manuscript.

The author thanks the University of North Carolina Kenan-Flagler Business School for financial support.

References

Agrawal, N., S. Smith. 1996. Estimating negative binomial demand for retail inventory management with

unobservable lost sales. Nav. Res. Log. 43 839–861.

Atali, A., H. Lee, O. Ozer. 2011. If the inventory manager knew: Value of visibility and RFID under imperfect

inventory information. Stanford University. Working paper.

Bensoussan, A., M. Cakanyildirim, A. Royal, S. Sethi. 2009a. Bayesian and adaptive controls for a newsvendor

facing exponential demand. Risk and Decision Analysis 1 197–210.

Bensoussan, A., M. Cakanyildirim, S. Sethi. 2007. Partially observed inventory systems: The case of zero

balance walk. SIAM J. Contr. Opt. 46(1) 176–209.

Bensoussan, A., M. Cakanyildirim, S. Sethi. 2009b. A note on ‘the censored newsvendor and the optimal

acquisition of information’. Oper. Res. 57(3) 791–794.

Besbes, O., A. Muharremoglu. 2012. On implications of demand censoring in the newsvendor problem.

Columbia University. Working paper.

Bisi, A., M. Dada, S. Tokdar. 2011. A censored-data multiperiod inventory problem with newsvendor demand

distributions. Manufacturing Service Oper. Management 37(11) 1390–1405.

Braden, D. J., M. Freimer. 1991. Informational dynamics of censored observations. Management Sci. 37(11)

1390–1405.

Burnetas, A., C. Smith. 2000. Adaptive ordering and pricing for perishable products. Oper. Res. 48(3)

436–443.

Cachon, G. P., A. G. Kok. 2007. How to (and how not to) estimate the salvage value in the newsvendor

model. Manufacturing Service Oper. Management. 9(3) 276–290.

Camdereli, A. Z., J. M. Swaminathan. 2010. Misplaced inventory and radio-frequency identification (RFID)

technology: Information and coordination. Production Oper. Management 19(1) 1–18.

Chen, L. 2010. Bounds and heuristics for optimal Bayesian inventory control with unobserved lost sales.

Oper. Res. 58(2) 396–413.

Chen, L., E. L. Plambeck. 2008. Dynamic inventory management with learning about the demand distribu-

tion and substitution probability. Manufacturing Service Oper. Management. 10(2) 236–256.

Conrad, S. A. 1976. Sales data and the estimation of demand. Operational Res. Quart. 27(1) 123–127.

Cooper, W. L., T. Homem-de-Mello, A. J. Kleywegt. 2006. Models of the spiral down effect in revenue

management. Oper. Res. 54(5) 968–987.

Cooper, W. L., T. Homem-de-Mello, A. J. Kleywegt. 2009. Learning and pricing with models that do not

explicitly incorporate competition. Working paper, University of Minnesota.

DeHoratius, N., A. Mersereau, L. Schrage. 2008. Retail inventory management when records are inaccurate.

Manufacturing Service Oper. Management. 10(2) 257–277.

DeHoratius, N., A. Raman. 2008. Inventory record inaccuracy: An empirical analysis. Management Sci.

54(4) 627–641.

DeHoratius, N., Z. Ton. 2009. The role of execution in managing product availability. N. Agrawal, S. A.

Smith, eds., Retail Supply Chain Management . Springer Science+Business Media, LLC, 53–77.

Ding, X., M. Puterman, A. Bisi. 2002. The censored newsvendor and the optimal acquisition of information.

Oper. Res. 50(3) 517–527.

Gaukler, G. M., R. W. Seifert, W. H. Hausman. 2007. Item-level RFID in the retail supply chain. Production

Oper. Management 16(1) 65–76.

Gelman, A., J. B. Carlin, H. S. Stern, D. B. Rubin. 2004. Bayesian Data Analysis. 2nd ed. Chapman &

Hall/CRC.

Gruen, T. W., D. Corsten, S. Bharadwaj. 2002. Retail out of stocks: A worldwide examination of causes,

rates, and consumer responses. Washington, D.C.: Grocery Manufacturers of America.

Harpaz, G., W. Lee, R. Winkler. 1982. Learning, experimentation, and the optimal output decisions of a

competitive firm. Management Sci. 28 589–603.

Heese, H. S. 2007. Inventory record inaccuracy, double marginalization, and RFID adoption. Production

Oper. Management 16(5) 542–553.

Huh, W. T., R. Levi, P. Rusmevichientong, J. B. Orlin. 2011. Adaptive data-driven inventory control with

censored demand based on Kaplan-Meier estimator. Oper. Res. 59(4) 929–941.

Huh, W. T., P. Rusmevichientong. 2009. A non-parametric asymptotic analysis of inventory planning with

censored demand. Math. Oper. Res. 34(1) 103–123.

Iglehart, D. L., R. C. Morey. 1972. Inventory systems with imperfect asset information. Management Sci.

18(8) B338–B394.

Kang, Y., S. Gershwin. 2005. Information inaccuracy in inventory systems: Stock loss and stockout. IIE

Trans. 37 843–859.

Kok, A. G., K. Shang. 2007. Inspection and replenishment policies for systems with inventory record inac-

curacy. Manufacturing Service Oper. Management 9(2) 185–205.

Kok, A. G., K. Shang. 2010. Evaluation of supply chains with inventory record inaccuracy and implications

on RFID investments. Duke University. Working paper.

Lariviere, M., E. Porteus. 1999. Stalking information: Bayesian inventory management with unobserved lost

sales. Management Sci. 45(3) 346–363.

Le Cam, L. 1986. Asymptotic Methods in Statistical Decision Theory . Springer-Verlag.

Lee, H., O. Ozer. 2007. Unlocking the value of RFID. Prod. Oper. Management 16(1) 40–64.

Lee, S., T. Homem de Mello, A. Kleywegt. 2009. Newsvendor models with decision-dependent uncertainty.

Northwestern University. Working paper.

Lu, X., J.-S. Song, K. Zhu. 2005. On ‘the censored newsvendor and the optimal acquisition of information’.

Oper. Res. 53(6) 1024–1027.

Lu, X., J.-S. Song, K. Zhu. 2007. Inventory control with unobservable lost sales and Bayesian updates. Duke

University. Working paper.

Lu, X., J.-S. Song, K. Zhu. 2008. Analysis of perishable-inventory systems with censored demand data. Oper.

Res. 56(4) 1034–1038.

Mersereau, A. J. 2011. Information-sensitive replenishment when inventory records are inaccurate. Forth-

coming in Production Oper. Management.

Nahmias, S. 1994. Demand estimation in lost sales inventory systems. Naval Res. Log. 41 739–757.

Rekik, Y., E. Sahin, Y. Dallery. 2008. Analysis of the impact of RFID technology on reducing product

misplacement errors at retail stores. Int. J. Prod. Econom. 112(1) 264–278.

Ton, Z., A. Raman. 2006. Cross sectional analysis of phantom products at retail stores. Harvard University.

Working paper.

Wecker, W. E. 1978. Predicting demand from sales data in the presence of stockouts. Management Sci.

24(10) 1043–1054.

A. Proof of Proposition 2

The result is easy to see once we establish two lemmas.

Lemma 5. Under the assumptions of Proposition 2, Pr{cnτ = 1}−Pr{crτ = 1}= E [cnτ − crτ ]≤ η where

η < 0 is a constant with respect to jτ .

Proof. We consider two cases. First, assume Pr{ε < 0}= 0 so that Pr{jτ ≥ iτ}= 1 and case (C) of

Table 1 occurs with probability zero. Under this assumption, Table 1 implies that we can write

Pr{cnτ = 1}−Pr{crτ = 1} = −Pr{jτ >dτ ≥ iτ}−Pr{dτ ≥ jτ > iτ}

= −Pr{dτ ≥ jτ > iτ}

≤ −Pr{dτ ≥ jτ}Pr{ετ > 0}

≤ −Pr{dτ ≥M}Pr{ετ > 0}

= −Pr{ετ > 0} e^{−θ g(M)},

where the first inequality relies on the independence of dτ and iτ given jτ .

Second, assume Pr{ε < 0}> 0. The first inequality below is shown in the proof of Proposition 1.

Pr{cnτ = 1}−Pr{crτ = 1} ≤ Pr{dτ ≥ jτ ≥ iτ}−Pr{dτ ≥ iτ}

= Pr{dτ > jτ ≥ iτ}−Pr{dτ ≥ iτ > jτ}−Pr{jτ >dτ ≥ iτ}−Pr{dτ ≥ jτ ≥ iτ}

≤ −Pr{dτ ≥ iτ > jτ}

= −Pr{dτ ≥ iτ |iτ > jτ}Pr{ετ < 0}

≤ −Pr{dτ ≥M − ετ |iτ > jτ}Pr{ετ < 0}

≤ −(1/2)Pr{dτ ≥M −m|iτ > jτ}Pr{ετ < 0}

= −(1/2) Pr{ετ < 0} e^{−θ g(M−m)},

where m is the median of the distribution of ε|ε < 0.

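Lemma 6. Under the assumptions of Proposition 2,

\[
\lim_{t\to\infty} \Pr\Big\{\sum_{\tau=1}^{t} (c^n_\tau - c^r_\tau) < 0\Big\} = 1.
\]

Proof. Consider the random variables

\[
x(\tau) = c^n_\tau - c^r_\tau, \qquad
X(t) = \sum_{\tau=1}^{t} (c^n_\tau - c^r_\tau) = \sum_{\tau=1}^{t} x(\tau).
\]

Note that x(τ) is defined on {−1, 0, 1} and therefore its variance v(τ) is upper bounded by 4. Because the x(τ)’s are independent of each other conditional on j_1, . . . , j_t,

\[
V(t) \equiv \mathrm{Var}(X(t)) = \sum_{\tau=1}^{t} v(\tau) \le 4t.
\]

By Lemma 5 we have that

\[
\bar{X}(t) \equiv \mathrm{Mean}(X(t)) \le \eta t < 0.
\]

The one-tailed Chebyshev inequality applied to X(t) yields

\[
\Pr\big\{X(t) - \bar{X}(t) \ge k\sqrt{V(t)}\big\} \le \frac{1}{1+k^2} \quad \text{for any } k > 0.
\]

Specifically, setting k = −X̄(t)/√V(t), we obtain

\[
\Pr\{X(t) \ge 0\} = \Pr\{X(t) - \bar{X}(t) \ge -\bar{X}(t)\}
\le \Big(1 + \frac{(\bar{X}(t))^2}{V(t)}\Big)^{-1}
\le \Big(1 + \frac{\eta^2 t^2}{4t}\Big)^{-1}
= \Big(1 + \frac{\eta^2 t}{4}\Big)^{-1} \xrightarrow{\,t\to\infty\,} 0.
\]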

To complete the proof of Proposition 2, it remains to show that

\[
\sum_{\tau=1}^{t} (c^n_\tau - c^r_\tau) = \sum_{\tau=1}^{t} (1 - c^r_\tau) - \sum_{\tau=1}^{t} (1 - c^n_\tau) < 0
\]

implies that the tracking seller’s predictive distribution stochastically dominates that of the naive seller. Because the naive and tracking sellers stock the same quantities and are subject to the same sequences of demands and errors, we have s^r_τ = s^n_τ for all τ and therefore Σ_{τ=1}^{t} s^r_τ = Σ_{τ=1}^{t} s^n_τ. The desired result then follows directly from Proposition 6.3 of Braden and Freimer (1991).

B. Analysis of Equations (8) and (9)

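We begin with some definitions. We define for conciseness of notation:

\[
\begin{aligned}
P &= \Pr\{\varepsilon \le 0\}, \\
M &= M_\varepsilon(\theta), \\
q(\hat{\theta}) &= \exp\Big\{\frac{\theta}{\hat{\theta}}\ln(1-\gamma)\Big\}, \\
q'(\hat{\theta}) &= \exp\Big\{\frac{\theta}{\hat{\theta}}\ln(1-\gamma)\Big\}\Big(-\frac{\theta}{\hat{\theta}^2}\ln(1-\gamma)\Big)
 = q(\hat{\theta})\Big(-\frac{\theta}{\hat{\theta}^2}\ln(1-\gamma)\Big).
\end{aligned}
\]

Throughout this section we assume Mε(θ) > 1, for which it is sufficient to have Pr{ε = 0} < 1 and E[ε] ≥ 0. With this notation, we can write the right-hand side of equation (9) as

\[
R(\hat{\theta}) = \frac{1 - \Pr\{\varepsilon \le 0\}\exp\big\{\tfrac{\theta}{\hat{\theta}}\ln(1-\gamma)\big\}}
                      {1 - M_\varepsilon(\theta)\exp\big\{\tfrac{\theta}{\hat{\theta}}\ln(1-\gamma)\big\}}
 = \frac{1 - P q(\hat{\theta})}{1 - M q(\hat{\theta})}.
\]

We study R(θ̂) over the range (0, θ̄), where θ̄ = θ ln(1 − γ)/ln(1/M). This range corresponds with 0 < 1 − Mq(θ̂) < 1. We first present a lemma establishing the shape of R(θ̂).

Lemma 7. Suppose that E[ε] ≥ 0. Then R(θ̂) is convex and non-decreasing on (0, θ̄) with lim_{θ̂→θ̄} R(θ̂) = +∞ and lim_{θ̂→0+} R(θ̂) = 1.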

Proof. An argument very similar to one in the proof of Proposition 3 gives us that Mε(θ) ≥ 1.

Therefore R(θ)≥ 1 for all θ ∈ (0, θ). The first limit follows because 1−Mq(θ) = 0, and the second

limit follows because limθ→0+ q(θ) = 0.

We next show that R(θ) is nondecreasing on (0, θ). R(θ) is continuous with derivative

dR(θ)

dθ=

(1−Mq(θ)(−Pq′(θ))− (1−Pq(θ))(−Mq′(θ)

[1−Mq(θ)]2

=q′(θ)

[PMq(θ)−P +M −PMq(θ)

][1−Mq(θ)]2

=q′(θ)[M −P ]

[1−Mq(θ)]2≥ 0

because M ≥ 1≥ P and because q′(θ)≥ 0.

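Finally, we show convexity. Letting $A = -\bar\theta \ln(1-\gamma)[M - P]$ and noting that $A \ge 0$, we have

\[
\begin{aligned}
\frac{d^{2}R(\theta)}{d\theta^{2}}
&= A\,\frac{\theta^{2}[1 - Mq(\theta)]^{2}q'(\theta) - q(\theta)\bigl[2\theta[1 - Mq(\theta)]^{2} + 2\theta^{2}[1 - Mq(\theta)]\bigl(-Mq'(\theta)\bigr)\bigr]}{\theta^{4}[1 - Mq(\theta)]^{4}} \\
&= A\,\frac{\theta[1 - Mq(\theta)]q'(\theta) - 2q(\theta)[1 - Mq(\theta)] + 2M\theta q'(\theta)q(\theta)}{\theta^{3}[1 - Mq(\theta)]^{3}} \\
&= \frac{Aq(\theta)}{\theta^{3}[1 - Mq(\theta)]^{3}}
\left\{[1 - Mq(\theta)]\left(-\frac{\bar\theta}{\theta}\ln(1-\gamma) - 2\right) + 2\theta M q'(\theta)\right\} \\
&= \frac{Aq(\theta)}{\theta^{3}[1 - Mq(\theta)]^{3}}
\left\{-2 + M\left[-\frac{\bar\theta}{M\theta}\ln(1-\gamma) + q(\theta)\left(-\frac{\bar\theta}{\theta}\ln(1-\gamma) + 2\right)\right]\right\}.
\end{aligned}
\]

The factor multiplying the curly brackets is nonnegative on $(0, \theta_{\max})$, so showing convexity is equivalent to showing that the term in curly brackets is greater than or equal to zero. Define

\[
z(y) = \frac{\bar\theta y}{M} + e^{-\bar\theta y}\,[\bar\theta y + 2]
\]

so that $z\left(-\frac{1}{\theta}\ln(1-\gamma)\right)$ equals the term in square brackets. We show that $z(y) \ge 2/M$ for all $y \ge 0$, which implies that the term in curly brackets is greater than or equal to zero. Note that $\lim_{y \to 0^+} z(y) = 2 \ge 2/M$ and $\lim_{y \to \infty} z(y) = +\infty$. Because $z(y)$ is differentiable on $(0, +\infty)$, we can check $\min_{y \ge 0} z(y) \ge 2/M$ by verifying $z(y^*) \ge 2/M$ at all positive stationary points $y^*$ of $z$. All positive stationary points must satisfy

\[
z'(y^*) = \frac{\bar\theta}{M} - \bar\theta e^{-\bar\theta y^*}[\bar\theta y^* + 2] + \bar\theta e^{-\bar\theta y^*} = 0,
\]

or equivalently,

\[
e^{-\bar\theta y^*} = \frac{1}{M}\cdot\frac{1}{1 + \bar\theta y^*}.
\]

Therefore

\[
z(y^*) = \frac{\bar\theta y^*}{M} + \frac{1}{M}\cdot\frac{1}{\bar\theta y^* + 1}\,[\bar\theta y^* + 2]
= \frac{1}{M}\left[1 + \bar\theta y^* + \frac{1}{1 + \bar\theta y^*}\right] \ge \frac{2}{M}
\]

because the sum of a positive number and its reciprocal is greater than or equal to 2.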
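The shape properties in Lemma 7 can be spot-checked numerically. The following is a minimal sketch, not part of the original analysis: the three-point distribution for ε, the value `THETA_BAR` of the fixed parameter appearing in the numerator of the exponent of q, and the fractile `GAMMA` are all hypothetical choices made only for illustration. It evaluates R on a grid and tests monotonicity, discrete convexity, and the bound z(y) ≥ 2/M.

```python
import math

# Hypothetical three-point error distribution (an assumption for illustration):
# E[eps] = 0.1 >= 0 and Pr{eps = 0} = 0.5 < 1, as the lemma requires.
EPS_VALUES = (-1.0, 0.0, 1.0)
EPS_PROBS = (0.2, 0.5, 0.3)

THETA_BAR = 0.5  # fixed parameter at which the MGF is evaluated (assumed value)
GAMMA = 0.9      # fractile appearing in ln(1 - gamma) (assumed value)

# P = Pr{eps <= 0} and M = E[exp(THETA_BAR * eps)], both constants.
P = sum(p for v, p in zip(EPS_VALUES, EPS_PROBS) if v <= 0)
M = sum(p * math.exp(THETA_BAR * v) for v, p in zip(EPS_VALUES, EPS_PROBS))

# Right endpoint of the interval on which 0 < 1 - M q(theta) < 1.
THETA_MAX = THETA_BAR * math.log(1 - GAMMA) / math.log(1 / M)


def q(theta):
    """q(theta) = exp{(THETA_BAR / theta) ln(1 - gamma)}."""
    return math.exp((THETA_BAR / theta) * math.log(1 - GAMMA))


def R(theta):
    """R(theta) = (1 - P q(theta)) / (1 - M q(theta))."""
    return (1 - P * q(theta)) / (1 - M * q(theta))


def z(y):
    """Auxiliary function from the convexity argument."""
    return THETA_BAR * y / M + math.exp(-THETA_BAR * y) * (THETA_BAR * y + 2)


# Uniform interior grid on (0, THETA_MAX).
grid = [THETA_MAX * k / 1000 for k in range(1, 1000)]
values = [R(t) for t in grid]

# Monotonicity: consecutive values do not decrease (up to rounding).
nondecreasing = all(b >= a - 1e-12 for a, b in zip(values, values[1:]))

# Discrete convexity: second differences on a uniform grid are nonnegative.
convex = all(
    values[i - 1] - 2 * values[i] + values[i + 1] >= -1e-9
    for i in range(1, len(values) - 1)
)

# Bound z(y) >= 2/M on a grid of y in [0, 50].
z_bound_holds = all(z(0.01 * k) >= 2 / M - 1e-12 for k in range(0, 5001))

print(nondecreasing, convex, z_bound_holds)
```

A grid check of this kind cannot replace the proof; it only guards against algebra slips when the formulas are transcribed for a particular error distribution.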

Lemma 7 implies that a solution to equation (9) is an intersection of the convex function $R(\theta)$ and the linear function $L(\theta) \equiv \theta/\bar\theta$. Therefore equation (9) has at most two solutions on $(0, \theta_{\max})$. The following proposition gives a sufficient condition for equation (9) to have two solutions.

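Proposition 8. Assuming $E[\varepsilon] \ge 0$, a solution $\theta \in (0, \theta_{\max})$ to equation (9) exists if and only if

\[
\min_{\alpha \in (0,1)} \frac{-\ln\left(\dfrac{\alpha}{M_\varepsilon(\bar\theta)}\right)\left(1 - \dfrac{\Pr\{\varepsilon \le 0\}}{M_\varepsilon(\bar\theta)}\,\alpha\right)}{1 - \alpha} < -\ln(1-\gamma).
\]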

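Proof. By Lemma 7, $R(\theta)$ is greater than $L(\theta)$ as $\theta \to 0^+$ and as $\theta \to \theta_{\max}$; therefore equation (9) having two solutions is equivalent to there existing a $\theta \in (0, \theta_{\max})$ such that $R(\theta) < L(\theta)$, i.e.,

\[
\frac{\theta}{\bar\theta} > \frac{1 - P\exp\{(\bar\theta/\theta)\ln(1-\gamma)\}}{1 - M\exp\{(\bar\theta/\theta)\ln(1-\gamma)\}}.
\]

Noting that $0 < M\exp\{(\bar\theta/\theta)\ln(1-\gamma)\} < 1$ for $\theta \in (0, \theta_{\max})$, we can define $\alpha \equiv M\exp\{(\bar\theta/\theta)\ln(1-\gamma)\}$, so that $\theta/\bar\theta = \ln(1-\gamma)/\ln(\alpha/M)$, and reframe the problem as seeking an $0 < \alpha < 1$ such that

\[
\frac{\ln(1-\gamma)}{\ln(\alpha/M)} > \frac{1 - P\alpha/M}{1 - \alpha}.
\]

Multiplying both sides by $-\ln(\alpha/M) > 0$ shows that this is equivalent to the desired condition.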
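The condition of Proposition 8 and the two intersections of R with the line L can be computed directly. The sketch below is illustrative only: the error distribution, `THETA_BAR` (the fixed parameter in the slope of L and in the exponent of q), and `GAMMA` are hypothetical values, and the grid search and `bisect` helper are simple stand-ins rather than the procedure used in the paper's numerical studies.

```python
import math

# Hypothetical inputs (assumptions for illustration only).
EPS_VALUES = (-1.0, 0.0, 1.0)
EPS_PROBS = (0.2, 0.5, 0.3)
THETA_BAR = 0.5
GAMMA = 0.9

P = sum(p for v, p in zip(EPS_VALUES, EPS_PROBS) if v <= 0)
M = sum(p * math.exp(THETA_BAR * v) for v, p in zip(EPS_VALUES, EPS_PROBS))
THETA_MAX = THETA_BAR * math.log(1 - GAMMA) / math.log(1 / M)


def R(theta):
    """Right-hand side of equation (9) in the P, M, q notation."""
    q = math.exp((THETA_BAR / theta) * math.log(1 - GAMMA))
    return (1 - P * q) / (1 - M * q)


def gap(theta):
    """R(theta) - L(theta); its roots are the solutions of equation (9)."""
    return R(theta) - theta / THETA_BAR


def condition_lhs(alpha):
    """Expression minimized over alpha in (0, 1) in Proposition 8."""
    return -math.log(alpha / M) * (1 - P * alpha / M) / (1 - alpha)


def bisect(f, lo, hi, iters=200):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    flo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Approximate the minimum over alpha on a fine grid and test the condition.
alphas = [k / 10000 for k in range(1, 10000)]
min_lhs = min(condition_lhs(a) for a in alphas)
condition_holds = min_lhs < -math.log(1 - GAMMA)

# If R dips below L somewhere, bracket and refine the two crossings.
grid = [THETA_MAX * k / 10000 for k in range(1, 10000)]
theta_mid = min(grid, key=gap)
roots = []
if gap(theta_mid) < 0:
    roots = [
        bisect(gap, grid[0], theta_mid),   # left-most solution
        bisect(gap, theta_mid, grid[-1]),  # solution nearer THETA_MAX
    ]

print(condition_holds, roots, THETA_MAX)
```

Because R is convex and L is linear, a single grid point with a negative gap suffices to bracket both crossings, which is why one bisection on each side of `theta_mid` is enough.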

Proposition 9. The existence of solutions to equation (9) is sufficient for the existence of a solution to equation (8).

Proof. We have $M_r(\bar\theta, j) \to M_\varepsilon(\bar\theta)$ as $\theta \to 0^+$ (equivalently, as $j \to +\infty$). Therefore, the right-hand side of equation (8) has the limit 1 as $\theta \to 0^+$. The result follows from the continuity of the right-hand side of (8) and the fact that the right-hand side of (8) is less than or equal to $R(\theta)$ for all $\theta \in (0, \theta_{\max})$ because $M_r(\bar\theta, j) \le M_\varepsilon(\bar\theta)$.

In all of our numerical studies of equation (9), we have found one solution to be in the vicinity of the demand mean and one to be close to $\theta_{\max}$. Convergence has consistently been to the former solution. For the latter solution, the corresponding $j$ typically yields a considerable $\Pr\{j - \varepsilon < 0\}$, suggesting that equation (9) is probably a poor approximation of equation (8) in the vicinity of this solution. The following result establishes that, for the solutions of interest, the $\theta$ solution of equation (9) is an upper bound on the $\theta$ solution of equation (8). As discussed in Section 5.2, the difference between these two solutions is negligible for the instances we consider.

Proposition 10. Assume $E[\varepsilon] \ge 0$ and let $\theta_8$ and $\theta_9$ denote the smallest $\theta$ solutions of equations (8) and (9), respectively, assuming they both exist. Then $\theta_9 \ge \theta_8$.

Proof. Let $R(\theta)$ denote the right-hand side of equation (9) as previously, and let $R_8(\theta)$ denote the right-hand side of equation (8) as a function of $\theta$. The desired result follows because both $R(\theta)$ and $R_8(\theta)$ intersect $\theta/\bar\theta$ from above at their left-most intersections, and because $M_r(\bar\theta, j) \le M_\varepsilon(\bar\theta)$ implies that $R_8(\theta) \le R(\theta)$ for $\theta \in (0, \theta_{\max})$.
