SIMULATION MODELING AND ANALYSIS WITH ARENA
T. Altiok and B. Melamed
Chapter 7
Input Analysis



Input Analysis Activities

• Input Analysis activities consist of the following stages:

  Stage 1: data collection
  Stage 2: data analysis
  Stage 3: modeling time series data
  Stage 4: goodness-of-fit testing

• Random variables with negligible variability are simplified and modeled as deterministic quantities.

• Unknown distributions are postulated to have a particular functional form that incorporates any available partial information.


Data Collection

• To illustrate data collection activities, consider modeling a painting station, where
  • jobs arrive at random and wait in the buffer until the sprayer is available
  • having been sprayed, they leave the station
  • suppose that the spray nozzle can get clogged, an event that results in a stoppage during which the nozzle is cleaned or replaced
  • suppose further that the measure of interest is the expected job delay in the buffer

• The data collection activity in this simple case would consist of the following tasks:
  1. collection of job inter-arrival times
  2. collection of painting times
  3. collection of times between nozzle clogging
  4. collection of nozzle cleaning/replacement times


Data Analysis

• Data Analysis deals with statistics of empirical data:
  • statistics related to moments (mean, standard deviation, coefficient of variation, etc.)
  • statistics related to distributions (histograms)
  • statistics related to temporal dependence (autocorrelations within an empirical time series, or cross-correlations among two or more distinct time series)

• For example, consider the sample of 100 repair time observations

  12.9  27.7  13.5  13.7  22.2
  20.9  26.6  29.1  22.4  10.7
  30.0  27.4  18.8  25.3  15.0
  17.0  21.7  13.7  15.5  23.2
  11.0  27.5  22.5  27.1  25.2
  10.3  18.0  11.5  14.1  24.0
  10.9  27.0  24.2  25.6  22.4
  21.0  21.3  23.1  15.8  13.2
  22.8  25.9  22.4  13.8  16.6
  10.8  10.3  15.1  19.0  27.9
  20.5  19.4  10.9  24.1  10.9
  22.2  25.5  17.2  10.9  15.6
  14.3  29.9  17.8  19.8  17.6
  13.3  24.0  29.7  18.1  28.4
  28.6  26.9  20.7  22.0  16.8
  19.4  27.4  22.5  28.3  27.1
  18.9  11.9  13.2  10.9  22.1
  16.7  28.5  19.9  18.5  16.5
  12.7  18.1  15.0  21.0  25.7
  19.5  11.9  22.9  23.2  18.9
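The moment and histogram statistics listed above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the Arena Input Analyzer's own procedure; the function names `summarize` and `histogram` and the choice of ten cells of width 2 starting at 10 are assumptions made for the example. Observations that fall exactly on a cell boundary may be assigned differently by other tools, so individual cell counts can differ by an observation or so.

```python
import statistics

# Repair-time sample from the slide (100 observations).
repair_times = [
    12.9, 27.7, 13.5, 13.7, 22.2, 20.9, 26.6, 29.1, 22.4, 10.7,
    30.0, 27.4, 18.8, 25.3, 15.0, 17.0, 21.7, 13.7, 15.5, 23.2,
    11.0, 27.5, 22.5, 27.1, 25.2, 10.3, 18.0, 11.5, 14.1, 24.0,
    10.9, 27.0, 24.2, 25.6, 22.4, 21.0, 21.3, 23.1, 15.8, 13.2,
    22.8, 25.9, 22.4, 13.8, 16.6, 10.8, 10.3, 15.1, 19.0, 27.9,
    20.5, 19.4, 10.9, 24.1, 10.9, 22.2, 25.5, 17.2, 10.9, 15.6,
    14.3, 29.9, 17.8, 19.8, 17.6, 13.3, 24.0, 29.7, 18.1, 28.4,
    28.6, 26.9, 20.7, 22.0, 16.8, 19.4, 27.4, 22.5, 28.3, 27.1,
    18.9, 11.9, 13.2, 10.9, 22.1, 16.7, 28.5, 19.9, 18.5, 16.5,
    12.7, 18.1, 15.0, 21.0, 25.7, 19.5, 11.9, 22.9, 23.2, 18.9,
]


def summarize(sample):
    """Return (mean, sample standard deviation, coefficient of variation)."""
    mean = statistics.fmean(sample)
    stdev = statistics.stdev(sample)
    return mean, stdev, stdev / mean


def histogram(sample, low, width, cells):
    """Count observations per cell; cell j covers [low + j*width, low + (j+1)*width).

    Assumes every observation is >= low; values at or beyond the upper end
    are placed in the last cell.
    """
    counts = [0] * cells
    for x in sample:
        j = min(int((x - low) / width), cells - 1)
        counts[j] += 1
    return counts


mean, stdev, cv = summarize(repair_times)
counts = histogram(repair_times, low=10.0, width=2.0, cells=10)
print(f"mean={mean:.2f}  stdev={stdev:.2f}  cv={cv:.3f}")
print("cell counts:", counts)
```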


Data Analysis Example

• Data Analysis of the repair time data produced the histogram and summary statistics shown below

[Figure: histogram and summary statistics of the repair time data]


Modeling Time Series Data

• Independent observations are modeled as a renewal time series, namely, a sequence of iid random variables. In this case, the analyst’s task is merely to identify (fit) a “good” distribution and its parameters to the empirical data.
  • Arena provides built-in facilities for fitting distributions to empirical data.

• Dependent observations are modeled as random processes with temporal dependence. In this case, the analyst’s task is to identify (fit) a “good” probability law to empirical data. This is a far more difficult task than the previous one, and often requires advanced mathematics.
  • Arena does not provide facilities for fitting dependent random processes
  • An advanced method is described, however, in Chapter 10

• Examples:
  • Observed sequences of arrival times to a queue are often modeled as iid exponential inter-arrival times (i.e., Poisson processes)
  • For an observed sequence of times to failure and the corresponding repair times, the associated uptimes may be modeled as a Poisson process, and the downtimes as a renewal process or as a dependent process (e.g., Markov process)
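One simple way to judge whether a renewal (iid) model is plausible is to inspect the sample autocorrelation of the observed series: values near zero are consistent with independence, while large values point to temporal dependence. The sketch below is an illustration under stated assumptions, not a procedure taken from the slides: the function `autocorrelation` is a hypothetical helper, and both series are synthetic (an iid exponential sequence and an autoregressive sequence with coefficient 0.8, chosen only to show the contrast).

```python
import random
import statistics


def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given positive lag."""
    n = len(series)
    mean = statistics.fmean(series)
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag))
    return num / denom


rng = random.Random(7)

# Synthetic iid series: exponential inter-arrival times with rate 1.
iid_series = [rng.expovariate(1.0) for _ in range(2000)]

# Synthetic dependent series: each value carries over 80% of the previous one.
dependent_series = [0.0]
for _ in range(1999):
    dependent_series.append(0.8 * dependent_series[-1] + rng.gauss(0.0, 1.0))

r_iid = autocorrelation(iid_series, lag=1)
r_dep = autocorrelation(dependent_series, lag=1)
print(f"lag-1 autocorrelation, iid series:       {r_iid:.3f}")
print(f"lag-1 autocorrelation, dependent series: {r_dep:.3f}")
```

A near-zero lag-1 value alone does not prove independence; in practice several lags would be examined.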


Modeling Empirical Distributions

• The simplest approach is to construct a histogram from the empirical data (sample), and then normalize it to a step pdf or a pmf, depending on the underlying state space. The obtained pdf or pmf is then declared to be the fitted distribution. The main advantage of this approach is that no assumptions are required on the functional form (shape) of the fitted distribution.

• The previous approach may reveal (by inspection) that the histogram pdf has a particular functional form (e.g., decreasing, bell shape, etc.). The analyst may then try to obtain a better fit by postulating a particular class of distributions having that shape, and then proceeding to estimate (fit) its parameters from the sample, using such common techniques as the method of moments and the maximum likelihood estimation (MLE) method. This approach can be further generalized to multiple functional forms by searching for the best fit among a number of postulated classes of distributions.
  • The Arena Input Analyzer provides facilities for both fitting approaches.
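The first approach, normalizing a histogram into a step pdf, can be sketched as follows. This is a minimal illustration: the helper `fit_step_pdf`, the ten-point sample, and the cell layout are all hypothetical and chosen only to keep the arithmetic easy to follow. Each cell height is the cell count divided by (sample size × cell width), so the step function integrates to one.

```python
def fit_step_pdf(sample, low, high, cells):
    """Return (heights, pdf) for a step pdf fitted to the sample.

    heights[j] is the density on cell j; pdf(x) evaluates the step function.
    Assumes every observation lies in [low, high].
    """
    width = (high - low) / cells
    counts = [0] * cells
    for x in sample:
        j = min(int((x - low) / width), cells - 1)  # right endpoint goes to last cell
        counts[j] += 1
    heights = [c / (len(sample) * width) for c in counts]

    def pdf(x):
        if x < low or x > high:
            return 0.0
        return heights[min(int((x - low) / width), cells - 1)]

    return heights, pdf


# Hypothetical sample, for illustration only.
sample = [1.2, 1.9, 2.4, 2.5, 3.1, 3.3, 3.8, 4.6, 5.0, 7.7]
heights, pdf = fit_step_pdf(sample, low=0.0, high=8.0, cells=4)

area = sum(h * 2.0 for h in heights)  # each cell has width 2.0
print("cell heights:", heights)
print("total area under the step pdf:", area)
```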


Method of Moments

• The method of moments fits the moments of a candidate model to sample moments, using appropriate empirical statistics as constraints on the candidate model parameters.

• As an example, consider a random variable $X$ and a data sample whose first two moments, $m_1$ and $m_2$, are estimated as $\hat{m}_1 = 8.5$ and $\hat{m}_2 = 125.3$.
  • Write the formulas for the mean and variance of a gamma distribution, connecting the first two moments of a gamma distribution with its parameters, $\alpha$ and $\beta$, namely

    $$m_1 = \alpha\beta, \qquad m_2 = \alpha\beta^2 + (\alpha\beta)^2 = \alpha(1+\alpha)\beta^2$$

  • Substitute into the above the previous estimates

    $$\hat{\alpha}\hat{\beta} = 8.5, \qquad \hat{\alpha}(1+\hat{\alpha})\hat{\beta}^2 = 125.3$$

  • Solve the above equations to obtain

    $$\hat{\beta} = \frac{125.3 - 8.5^2}{8.5} \approx 6.24, \qquad \hat{\alpha} = \frac{8.5}{\hat{\beta}} \approx 1.36$$
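A method-of-moments fit for a gamma distribution can be sketched in Python as below. The sketch relies on the standard gamma identities mean = αβ and variance = αβ² (shape α, scale β), so that β = variance / mean and α = mean / β. The function name `gamma_method_of_moments` is hypothetical, and the sample is synthetic, drawn with known parameters purely to check that the estimates land close to them.

```python
import random
import statistics


def gamma_method_of_moments(sample):
    """Estimate gamma (shape alpha, scale beta) by matching mean and variance."""
    m1 = statistics.fmean(sample)
    variance = statistics.pvariance(sample, mu=m1)
    beta = variance / m1   # from variance = alpha * beta**2 and mean = alpha * beta
    alpha = m1 / beta      # from mean = alpha * beta
    return alpha, beta


# Synthetic sample with known shape 2.0 and scale 3.0 (illustration only).
rng = random.Random(42)
sample = [rng.gammavariate(2.0, 3.0) for _ in range(20000)]

alpha_hat, beta_hat = gamma_method_of_moments(sample)
print(f"alpha_hat={alpha_hat:.3f}  beta_hat={beta_hat:.3f}")
```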


Maximal-likelihood Estimation (MLE)

• The Maximal-likelihood Estimation (MLE) method postulates a particular class of distributions (e.g., normal, uniform, exponential, etc.), and then estimates its parameters from the sample, such that the resulting parameters give rise to the maximal likelihood (highest probability or density) of obtaining the sample. More precisely,
  • Let $f(x; \theta)$ be the postulated pdf, as a function of its ordinary argument, $x$, as well as the unknown parameter $\theta$ (possibly a vector of parameters, but here assumed to be a scalar for simplicity)
  • Let $x_1, \ldots, x_N$ be a sample of independent observations

• The MLE method estimates $\theta$ via the likelihood function $L(x_1, \ldots, x_N; \theta)$, choosing as the estimate the value of $\theta$ that maximizes

  $$L(x_1, \ldots, x_N; \theta) = f(x_1; \theta)\, f(x_2; \theta) \cdots f(x_N; \theta)$$


MLE Method Examples

• For the exponential distribution Expo($\lambda$) with parameter $\lambda$,
  • the corresponding likelihood function is

    $$L(x_1, \ldots, x_N; \lambda) = \lambda e^{-\lambda x_1}\, \lambda e^{-\lambda x_2} \cdots \lambda e^{-\lambda x_N} = \lambda^N e^{-\lambda \sum_{i=1}^{N} x_i}$$

  • the log-likelihood function is

    $$\ln L(x_1, \ldots, x_N; \lambda) = N \ln(\lambda) - \lambda \sum_{i=1}^{N} x_i$$

  • the value of $\lambda$ that maximizes $\ln L(x_1, \ldots, x_N; \lambda)$ is obtained by differentiating it with respect to $\lambda$ and setting the derivative to zero, that is

    $$\frac{d}{d\lambda} \ln L(x_1, \ldots, x_N; \lambda) = \frac{N}{\lambda} - \sum_{i=1}^{N} x_i = 0$$

  • solving the above for $\lambda$ yields the maximal likelihood estimate

    $$\hat{\theta} = \hat{\lambda} = \frac{N}{\sum_{i=1}^{N} x_i} = \frac{1}{\bar{x}}$$

• For the uniform distribution Unif(a,b), a similar computation yields the MLE estimates

  $$\hat{a} = \min\{x_i : 1 \le i \le N\}, \qquad \hat{b} = \max\{x_i : 1 \le i \le N\}$$
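The two closed-form MLE results above translate directly into code. The following sketch is illustrative only: the function names are hypothetical and both samples are synthetic, generated with known parameters (rate 0.5, and the interval [10, 30]) so that the estimates can be compared with the values used to generate them. The log-likelihood function is included to confirm numerically that the estimate sits at a maximum.

```python
import math
import random


def exponential_log_likelihood(sample, lam):
    """ln L = N ln(lambda) - lambda * sum(x_i) for an exponential sample."""
    return len(sample) * math.log(lam) - lam * sum(sample)


def mle_exponential_rate(sample):
    """MLE of the exponential rate: N / sum(x_i), i.e. 1 / sample mean."""
    return len(sample) / sum(sample)


def mle_uniform(sample):
    """MLE of Unif(a, b): the sample minimum and maximum."""
    return min(sample), max(sample)


rng = random.Random(1)

# Synthetic exponential sample with rate 0.5 (mean 2.0).
expo_sample = [rng.expovariate(0.5) for _ in range(5000)]
lam_hat = mle_exponential_rate(expo_sample)

# Synthetic uniform sample on [10, 30].
unif_sample = [rng.uniform(10.0, 30.0) for _ in range(5000)]
a_hat, b_hat = mle_uniform(unif_sample)

print(f"lambda_hat={lam_hat:.4f}")
print(f"a_hat={a_hat:.3f}  b_hat={b_hat:.3f}")
```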


The Arena Input Analyzer

The Arena Input Analyzer is a tool that fits a distribution to sample data.

Arena-supported distributions and their parameters

Distribution    Arena Name    Arena Parameters
Exponential     EXPO          Mean
Normal          NORM          Mean, StdDev
Triangular      TRIA          Min, Mode, Max
Uniform         UNIF          Min, Max
Erlang          ERLA          ExpoMean, k
Beta            BETA          Beta, Alpha
Gamma           GAMM          Beta, Alpha
Johnson         JOHN          G, D, L, X
Log Normal      LOGN          LogMean, LogStdDev
Poisson         POIS          Mean
Weibull         WEIB          Beta, Alpha
Continuous      CONT          P1, V1, …
Discrete        DISC          P1, V1, …


Best-fit uniform distribution for the repair time data


Best-fit beta distribution for the repair time data


Best-fit gamma distribution for a sample of lead time data


Fit All Summary for a sample of lead time data


Goodness-of-Fit Tests for Distributions

• Tests of goodness-of-fit for distributions determine the likelihood that an empirical sample is drawn from a given distribution
  • a statistical hypothesis is formulated
  • a statistic is computed from the empirical data
  • the distribution of the statistic is assumed known under the null hypothesis, allowing the computation of the probability that it exceeds the observed value
  • rejection or acceptance decisions can be taken at a given significance level, but these are subject to Type I and Type II statistical errors

• Common goodness-of-fit tests for distributions:
  1. Chi-Square test
  2. Kolmogorov-Smirnov test


Chi-Square Test

• The Chi-Square test compares the empirical histogram density, constructed from sample data, to a candidate theoretical density
  • assume that the empirical sample $\{x_1, \ldots, x_N\}$ is a set of $N$ iid realizations from an underlying (unknown) random variable, $X$.
  • this sample is used to construct an empirical histogram with $J$ cells, where cell $j$ corresponds to the interval $[l_j, r_j)$

• The estimator of the probability $p_j = \Pr\{X \in [l_j, r_j)\}$ of cell $j$ is

  $$\hat{p}_j = \frac{N_j}{N}, \qquad j = 1, \ldots, J$$

  • $N_j$ is the number of observations in cell $j$
  • it is commonly suggested to take $N_j > 5$ for statistical reliability


Chi-Square Test (Cont.)

• Let $F_X(x)$ be some theoretical candidate distribution of the random variable $X$ whose goodness-of-fit is to be assessed

• Compute the corresponding theoretical probabilities

  $$p_j = \Pr\{X \in [l_j, r_j)\} = F_X(r_j) - F_X(l_j), \qquad j = 1, \ldots, J$$

  • for continuous data we have

    $$p_j = F_X(r_j) - F_X(l_j) = \int_{l_j}^{r_j} f_X(x)\, dx, \qquad j = 1, \ldots, J$$

    where $f_X(x)$ is the density of $X$

• The Chi-square test statistic is then given by

  $$\chi^2 = \sum_{j=1}^{J} \frac{(N_j - N p_j)^2}{N p_j}$$


Chi-Square Test Example

• As an example, consider the repair time sample data of size N = 100, given earlier, for which a histogram with J = 10 cells was constructed by the Input Analyzer

• The table below displays the elements of the Chi-Square test for the repair data

Cell Number   Cell Interval    Number of Observations   Relative Frequency   Theoretical Probability
$j$           $[l_j, r_j)$     $N_j$                    $\hat{p}_j$          $p_j$
 1            [10,12)          13                       0.13                 0.10
 2            [12,14)           9                       0.09                 0.10
 3            [14,16)           8                       0.08                 0.10
 4            [16,18)           9                       0.09                 0.10
 5            [18,20)          12                       0.12                 0.10
 6            [20,22)           8                       0.08                 0.10
 7            [22,24)          13                       0.13                 0.10
 8            [24,26)          10                       0.10                 0.10
 9            [26,28)          10                       0.10                 0.10
10            [28,30)           8                       0.08                 0.10


Chi-Square Test Example (Cont.)

• The histogram of the repair data suggests that a uniform distribution Unif(a,b) is an acceptably good fit to the sample repair data

• The parameters of the uniform distribution are estimated as:

  $$\hat{a} = \min\{x_i : 1 \le i \le N\} = 10, \qquad \hat{b} = \max\{x_i : 1 \le i \le N\} = 30$$

• The Chi-Square statistic computation yields

  $$\chi^2 = \frac{(13 - 10)^2}{10} + \cdots + \frac{(8 - 10)^2}{10} = 3.6$$

  with the corresponding sum of squared errors between relative frequencies and theoretical probabilities

  $$e^2 = \sum_{j=1}^{10} [\hat{p}_j - p_j]^2 = (0.13 - 0.10)^2 + \cdots + (0.08 - 0.10)^2 = 0.0036$$

• A Chi-Square table shows that for significance level $\alpha = 0.10$ and $d = 10 - 2 - 1 = 7$ degrees of freedom, the critical value is $c = 12.0$

• Since the test statistic computed above is $\chi^2 = 3.6 < 12.0$, we accept the null hypothesis that the uniform distribution Unif(10,30) is an acceptably good fit to the sample repair data
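The Chi-Square computation for the repair data can be reproduced with a short script. This is a sketch rather than the Input Analyzer's implementation: the cell counts are copied from the example table, the theoretical probabilities come from the Unif(10,30) cdf, and the critical value 12.0 is taken from the text instead of being computed. The helper names `uniform_cdf` and `chi_square_statistic` are assumptions made for this illustration.

```python
def uniform_cdf(x, a, b):
    """cdf of the uniform distribution on [a, b]."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def chi_square_statistic(counts, probs):
    """Sum over cells of (N_j - N*p_j)^2 / (N*p_j)."""
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, probs))


# Cell edges [10,12), [12,14), ..., [28,30) and observed counts from the table.
edges = [10 + 2 * j for j in range(11)]
counts = [13, 9, 8, 9, 12, 8, 13, 10, 10, 8]

# Theoretical cell probabilities p_j = F(r_j) - F(l_j) under Unif(10, 30).
probs = [uniform_cdf(r, 10, 30) - uniform_cdf(l, 10, 30)
         for l, r in zip(edges[:-1], edges[1:])]

chi2 = chi_square_statistic(counts, probs)
critical_value = 12.0          # from the text: alpha = 0.10, 7 degrees of freedom
reject = chi2 > critical_value

print(f"chi-square statistic = {chi2:.2f}")
print("reject null hypothesis:", reject)
```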


Kolmogorov-Smirnov Test

• The Kolmogorov-Smirnov (K-S) test compares the empirical cdf to a theoretical counterpart
  • while the Chi-Square test requires a considerable amount of data (at least enough to set up a reasonably “smooth” histogram), the K-S test can get away with smaller samples, since it does not require a histogram

• The K-S test procedure proceeds as follows:
  • sort the sample $\{x_1, \ldots, x_N\}$ in ascending order as $\{x_{(1)}, \ldots, x_{(N)}\}$
  • construct the empirical cdf

    $$\hat{F}_X(x) = \frac{\max\{j : x_{(j)} < x\}}{N}$$

  • construct the K-S test statistic

    $$KS = \max_x \, |\hat{F}_X(x) - F_X(x)|$$

The smaller the observed value of KS, the better the fit
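The K-S procedure can be sketched as follows. Because the empirical cdf is a step function, the largest gap to a continuous theoretical cdf occurs at one of the sorted observations, just before or just after the step; the sketch checks both sides at every observation. The function name `ks_statistic` and the three-point sample are hypothetical and used only for illustration.

```python
def ks_statistic(sample, cdf):
    """Largest absolute gap between the empirical cdf and a theoretical cdf."""
    xs = sorted(sample)
    n = len(xs)
    largest_gap = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # Gap just after the step at x (empirical cdf = i/n)
        # and just before it (empirical cdf = (i-1)/n).
        largest_gap = max(largest_gap, i / n - f, f - (i - 1) / n)
    return largest_gap


def unit_uniform_cdf(x):
    """cdf of the uniform distribution on [0, 1]."""
    return min(max(x, 0.0), 1.0)


# Hypothetical three-point sample tested against Unif(0, 1).
ks = ks_statistic([0.75, 0.25, 0.5], unit_uniform_cdf)
print(f"KS = {ks:.4f}")
```

To turn the statistic into an accept/reject decision it would still have to be compared with a critical value for the chosen significance level and sample size, which this sketch does not compute.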


Multi-Modal Distributions

• A mode of a distribution is a point at which its associated pdf or pmf attains its maximal value

• A uni-modal distribution has exactly one mode

• A multi-modal distribution is one whose associated pdf or pmf takes one of the following forms:
  1. It has more than one mode
  2. It has only one mode, but it is either not monotone increasing to the left of its mode, or not monotone decreasing to the right of its mode

  Thus, a multi-modal distribution has a pdf or pmf with multiple “humps”

• One approach to Input Analysis of multi-modal samples is:
  1. Separate the sample into mutually exclusive uni-modal sub-samples
  2. Fit a separate distribution to each sub-sample
  3. Combine the fitted models into a final model according to the relative frequency of each sub-sample


Multi-Modal Distribution Example

• Consider a sample of $N$ observations such that
  • $N_1$ observations appear to form a uni-modal distribution in an interval $I_1$
  • $N_2$ observations appear to form a uni-modal distribution in an interval $I_2$
  • $N_1 + N_2 = N$

• Suppose that the theoretical distributions, $F_1(x)$ and $F_2(x)$, are fitted separately to the respective sub-samples

• The combined distribution to be fitted to the entire sample is defined by

  $$F_X(x) = \frac{N_1}{N} F_1(x) + \frac{N_2}{N} F_2(x)$$

  • The distribution above is a legitimate distribution, formed as a probabilistic mixture of the two distributions, $F_1(x)$ and $F_2(x)$
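The mixture construction can be sketched in Python as below. Everything specific in the example is an assumption made for illustration: the sub-sample sizes (60 and 40), the two fitted components (uniform on [0, 10] and on [20, 30]), and the helper names. Sampling from the mixture picks the first component with probability N1/N and the second otherwise, which matches the weights in the combined cdf.

```python
import random


def mixture_cdf(x, n1, n2, cdf1, cdf2):
    """Combined cdf: (N1/N) * F1(x) + (N2/N) * F2(x), with N = N1 + N2."""
    n = n1 + n2
    return (n1 / n) * cdf1(x) + (n2 / n) * cdf2(x)


def sample_mixture(rng, n1, n2, draw1, draw2):
    """Draw one value: component 1 with probability N1/N, else component 2."""
    if rng.random() < n1 / (n1 + n2):
        return draw1(rng)
    return draw2(rng)


def uniform_cdf(a, b):
    """Return the cdf of the uniform distribution on [a, b] as a function."""
    return lambda x: min(max((x - a) / (b - a), 0.0), 1.0)


# Hypothetical fit: 60 observations near [0, 10], 40 observations near [20, 30].
N1, N2 = 60, 40
F1, F2 = uniform_cdf(0.0, 10.0), uniform_cdf(20.0, 30.0)

rng = random.Random(3)
draws = [sample_mixture(rng, N1, N2,
                        lambda r: r.uniform(0.0, 10.0),
                        lambda r: r.uniform(20.0, 30.0))
         for _ in range(10000)]
share_low = sum(1 for d in draws if d <= 10.0) / len(draws)

print("F_X(10) =", mixture_cdf(10.0, N1, N2, F1, F2))
print("share of draws from the first component:", share_low)
```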