Graphical Interactive Simulation Input Modeling with Bivariate Bézier Distributions

MARY ANN FLANIGAN WAGNER

Purdue University

and

JAMES R. WILSON

North Carolina State University

A graphical interactive technique for modeling bivariate simulation inputs is based on a family of continuous univariate and bivariate probability distributions with bounded support that are described by Bézier curves and surfaces, respectively. This family of distributions has a natural, extensible parameterization so that all parameters have a meaningful interpretation; and the complete family is capable of accurately representing an unlimited variety of shapes for marginal distributions together with many common types of bivariate stochastic dependence. This approach to simulation input modeling is implemented in a Windows-based software system called PRIME (PRobabilistic Input Modeling Environment). Several examples illustrate the application of PRIME to subjective and data-driven estimation of bivariate distributions representing simulation inputs.

Categories and Subject Descriptors: G.3 [Mathematics of Computing]: Probability and Statistics - statistical software; I.6.5 [Simulation and Modeling]: Model Development - modeling methodologies; I.6.7 [Simulation and Modeling]: Simulation Support Systems - environments

General Terms: Algorithms, Design, Theory

Additional Key Words and Phrases: Graphical interactive distribution fitting

1. INTRODUCTION

One of the central problems in the design and construction of stochastic simulation experiments is the selection of valid input models, that is, probability distributions that accurately mimic the behavior of the random input processes driving the system. In many applications, it is critical not only to capture the shape of the marginal distribution of each major input random variable but also to accurately represent the stochastic dependencies between those variates [Lewis and Orav 1989, p. 291]. For example, in modeling the arrival streams of liver-transplant donors and patients for the UNOS Liver Allocation Model [Pritsker et al. 1995], initially we had to model the stochastic dependence between the age and weight of each new arrival; and ultimately we had to expand our stochastic model to include the sex and blood type of each new arrival.

This work was partially supported by a David Ross Grant from the Purdue Research Foundation and by NSF grant DMS-8717799. A preliminary version of this work was presented at the 1994 Winter Simulation Conference, which was held December 11–14, 1994, in Orlando, Florida, and was sponsored by ASA, ACM, IEEE, IIE, NIST, ORSA, TIMS, and SCS.

Authors' addresses: Mary Ann Flanigan Wagner, Boeing Information Services, 7990 Boeing Court, Vienna, VA 22183-7000; email: (maf lani@w~. ecld; James R. Wilson, Department of Industrial Engineering, North Carolina State University, Raleigh, NC 27695-7906; email: jwilson@eos.ncsu.edu.

Permission to make digital/hard copy of all or part of this material without fee is granted provided that the copies are not made or distributed for profit or commercial advantage, the ACM copyright/server notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery, Inc. (ACM). To copy otherwise, to republish, to post on servers, or to redistribute to lists requires prior specific permission and/or a fee.

© 1995 ACM 1049-3301/95/0700-0163 $03.50

ACM Transactions on Modeling and Computer Simulation, Vol. 5, No. 3, July 1995, Pages 163-189.

Although many practitioners appreciate the need for valid models of multivariate simulation inputs, they lack effective and widely available tools for building such input models. Stanfield and Wilson [1993] developed a technique for fitting a multivariate distribution when the correlation matrix and the first four moments for each marginal distribution have been specified or estimated by the user. Because the fitted joint distribution is built from univariate marginals belonging to the Johnson translation system [Johnson 1949a; Swain et al. 1988], the multivariate input-modeling technique of Stanfield and Wilson has substantial flexibility. Unfortunately, the fitted joint distribution does not belong to the multivariate Johnson translation system [Johnson 1949b; Johnson 1987]; moreover, the corresponding conditional distributions do not belong to the Johnson system, and this lack of "closure" makes it impossible to obtain convenient closed-form expressions for the conditional distributions that naturally arise in many applications. Although DeBrota et al. [1989] developed a graphical interactive software system that enables the user to edit (manipulate) univariate bounded Johnson distributions, it is unclear that a similar tool could be based on the multivariate distribution-fitting procedure of Stanfield and Wilson.

Other approaches to multivariate input modeling can be based on TES (Transform-Expand-Sample) processes [Jagerman and Melamed 1992a, 1992b; Melamed et al. 1992] and ARTA (AutoRegressive To Anything) processes [Cario and Nelson 1995]. Both methodologies enable the user to specify the autocorrelation function out to an arbitrary lag for a univariate stochastic process with a user-specified marginal distribution, but ARTA processes seem to be substantially easier to use. Unfortunately, the conditional distributions associated with TES and ARTA processes do not appear to possess any advantages in analytical or numerical tractability when compared to multivariate processes based on the Johnson translation system. Software packages for fitting TES and ARTA processes are not widely available at the present time.

In this article we extend the univariate input-modeling methodology of Wagner and Wilson [1993, 1995] to handle continuous bivariate populations with bounded support, and we present a flexible, interactive, graphical technique for modeling a broad range of bivariate simulation inputs. We employ Bézier surfaces as the parametric form for representing the distribution function of continuous bivariate random vectors that are to be randomly sampled in a simulation experiment, and we show that the corresponding marginal and conditional distributions belong to the original family of univariate Bézier distributions. We implemented this methodology in a Microsoft Windows-based software system called PRIME (PRobabilistic Input Modeling Environment). A public-domain version of the software is available upon request.

The remainder of this article is organized as follows. In Section 2 we summarize the main properties of univariate Bézier distributions that are relevant for our development of bivariate Bézier distributions, and we establish some basic notation that is used throughout the paper. In Section 3 we detail our methodology for constructing, manipulating, and sampling bivariate Bézier distributions as well as the associated marginal and conditional univariate Bézier distributions. In Section 4 we describe the implementation of this methodology in PRIME, including techniques for interactively fitting bivariate simulation input models using subjective information (expert opinion) or sample data. In Section 5 we present some examples illustrating the diversity of bivariate distributions that can be modeled using this methodology. Finally, in Section 6 we summarize the main contributions of this work, and we make recommendations for future research. Although this paper is based on Flanigan [1993], some of our results were also presented in Wagner and Wilson [1994].

2. OVERVIEW OF UNIVARIATE BÉZIER DISTRIBUTIONS

2.1 Definition of Bézier Curves

In many applications of computer graphics, a Bézier curve is used to approximate a smooth (continuously differentiable) univariate function on a bounded interval by forcing the Bézier curve to pass in the vicinity of selected control points $\{\mathbf{p}_i \equiv (x_i, z_i)^{\mathrm{T}} : i = 0, 1, \ldots, n\}$. (Throughout this paper, all vectors will be column vectors unless otherwise stated; and the roman superscript T will denote the transpose of a vector or matrix so that each control point is understood to be a column vector.) A Bézier curve of degree $n$ with control points $\{\mathbf{p}_0, \mathbf{p}_1, \ldots, \mathbf{p}_n\}$ is given parametrically by

$$\mathbf{P}(t) = [P_x(t; n, \mathbf{x}),\, P_z(t; n, \mathbf{z})]^{\mathrm{T}} = \sum_{i=0}^{n} B_{n,i}(t)\,\mathbf{p}_i \quad \text{for } t \in [0, 1], \tag{1}$$

where $\mathbf{x} \equiv (x_0, x_1, \ldots, x_n)^{\mathrm{T}}$ and $\mathbf{z} \equiv (z_0, z_1, \ldots, z_n)^{\mathrm{T}}$, and where the blending function $B_{n,i}(t)$ is the Bernstein polynomial

$$B_{n,i}(t) \equiv \begin{cases} \dfrac{n!}{i!\,(n-i)!}\, t^{i} (1-t)^{n-i}, & \text{for } t \in [0, 1] \text{ and } i = 0, 1, \ldots, n, \\[1ex] 0, & \text{otherwise.} \end{cases} \tag{2}$$

Bézier curves have certain characteristics that are particularly important for graphically based approximation of functions [Farin 1990]:

(a) A Bézier curve exactly interpolates its initial and final control points; this means that the curve will pass through these control points.

(b) A Bézier curve is edited under global control; this means that any change in the location of a control point affects the shape of the entire curve.

In the definition (1) of the Bézier curve $\{\mathbf{P}(t) : t \in [0, 1]\}$, notice that for each $t \in [0, 1]$, we have $\sum_{i=0}^{n} B_{n,i}(t) = 1$ because the $i$th Bernstein polynomial $B_{n,i}(t)$ can be interpreted as the binomial probability of $i$ successes in $n$ independent trials with success probability $t$ on each trial; thus the Bézier curve traced by $\mathbf{P}(t)$ for all $t \in [0, 1]$ lies in the convex hull of the control points $\{\mathbf{p}_i : i = 0, 1, \ldots, n\}$. Although Bézier curves are edited under global control, the effect of the $i$th control point $\mathbf{p}_i$ on the shape of the curve is greatest at the value $t = i/n$ for the parameter $t$. In particular, as $t$ increases from 0 to 1, the weight $B_{n,0}(t)$ of the initial control point $\mathbf{p}_0$ decreases from 1 to 0 and the weight $B_{n,n}(t)$ of the final control point $\mathbf{p}_n$ increases from 0 to 1 in the overall weighted average (convex combination) of control points that determines the current location $\mathbf{P}(t)$ on the Bézier curve. Thus the control points act like "magnets"; and the "magnetic attraction" exerted on the Bézier curve $\{\mathbf{P}(t) : t \in [0, 1]\}$ by the $i$th control point $\mathbf{p}_i$ is strongest at the value $t = i/n$ for the parameter $t$ so that the corresponding point $\mathbf{P}(t)$ on the curve is "in the vicinity" of $\mathbf{p}_i$. If the weight (magnetic attraction) of a control point is 1, then the Bézier curve is forced to pass through that control point exactly.
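The properties discussed in this section can be checked numerically. The following Python fragment is a minimal sketch, not part of PRIME: the function names `bernstein` and `bezier_point` and the four control points are hypothetical and chosen only for illustration. It evaluates Equations (1) and (2), confirms that the Bernstein weights sum to one, that the curve interpolates its end control points, and that the weight of control point $i = 1$ peaks at $t = i/n$.

```python
from math import comb

def bernstein(n, i, t):
    """Bernstein polynomial B_{n,i}(t) of Equation (2); zero outside [0, 1]."""
    if not 0.0 <= t <= 1.0 or not 0 <= i <= n:
        return 0.0
    return comb(n, i) * t**i * (1.0 - t)**(n - i)

def bezier_point(control_points, t):
    """Point P(t) of Equation (1) for a list of (x, z) control points."""
    n = len(control_points) - 1
    x = sum(bernstein(n, i, t) * p[0] for i, p in enumerate(control_points))
    z = sum(bernstein(n, i, t) * p[1] for i, p in enumerate(control_points))
    return x, z

# Hypothetical control points (illustration only, not taken from the article).
pts = [(0.0, 0.0), (1.0, 0.1), (2.0, 0.9), (3.0, 1.0)]

weights = [bernstein(3, i, 0.4) for i in range(4)]   # weights at t = 0.4
start, end = bezier_point(pts, 0.0), bezier_point(pts, 1.0)

# Location of the largest weight of control point i = 1 on a grid over [0, 1].
grid = [k / 300 for k in range(301)]
peak_t = max(grid, key=lambda t: bernstein(3, 1, t))
```

Under these assumptions `sum(weights)` equals 1 up to rounding, `start` and `end` coincide with the first and last control points, and `peak_t` falls at $t = 1/3$.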

2.2 Formulation of Univariate Bézier Probability Distributions

In this section we summarize briefly some key properties of univariate Bézier distributions. For a detailed development of these properties, see Wagner and Wilson [1993, 1995]. Given a continuous random variable $X$ with bounded support $[x_*, x^*]$ and unknown cumulative distribution function (c.d.f.) $F_X(\cdot)$, we can approximate $F_X(\cdot)$ arbitrarily closely by a Bézier curve of the form (1) with sufficiently high degree $n$, where

$$x(t) = \sum_{i=0}^{n} B_{n,i}(t)\, x_i, \qquad F_X[x(t)] = \sum_{i=0}^{n} B_{n,i}(t)\, z_i \qquad \text{for all } t \in [0, 1]. \tag{3}$$

If $F_X(\cdot)$ is given parametrically by Equation (3), then the corresponding probability density function (p.d.f.) $f_X(\cdot)$ is given parametrically by

$$x(t) = \sum_{i=0}^{n} B_{n,i}(t)\, x_i, \qquad f_X[x(t)] = \frac{\sum_{i=0}^{n-1} B_{n-1,i}(t)\, \Delta z_i}{\sum_{i=0}^{n-1} B_{n-1,i}(t)\, \Delta x_i} \qquad \text{for all } t \in [0, 1], \tag{4}$$

with

$$\Delta x_i \equiv x_{i+1} - x_i, \qquad \Delta z_i \equiv z_{i+1} - z_i \qquad \text{for } i = 0, 1, \ldots, n-1.$$
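As a quick illustration of Equations (3) and (4), the sketch below evaluates the parametric c.d.f. and p.d.f. of a univariate Bézier distribution in Python. The function names are hypothetical, and the control points are an assumed example: equally spaced $x_i = z_i = i/n$, for which the linear-precision property of Bernstein polynomials gives $x(t) = t$ and $F_X[x(t)] = t$, that is, the Uniform[0, 1] distribution.

```python
from math import comb

def bernstein(n, i, t):
    """Bernstein polynomial B_{n,i}(t) for t in [0, 1]."""
    return comb(n, i) * t**i * (1.0 - t)**(n - i)

def bezier_cdf_point(xs, zs, t):
    """Return (x(t), F_X[x(t)]) as in Equation (3)."""
    n = len(xs) - 1
    x = sum(bernstein(n, i, t) * xs[i] for i in range(n + 1))
    cdf = sum(bernstein(n, i, t) * zs[i] for i in range(n + 1))
    return x, cdf

def bezier_pdf_point(xs, zs, t):
    """Return (x(t), f_X[x(t)]) as in Equation (4)."""
    n = len(xs) - 1
    num = sum(bernstein(n - 1, i, t) * (zs[i + 1] - zs[i]) for i in range(n))
    den = sum(bernstein(n - 1, i, t) * (xs[i + 1] - xs[i]) for i in range(n))
    x, _ = bezier_cdf_point(xs, zs, t)
    return x, num / den

# Assumed example: equally spaced control points reproduce Uniform[0, 1].
xs = [0.0, 1.0 / 3.0, 2.0 / 3.0, 1.0]
zs = [0.0, 1.0 / 3.0, 2.0 / 3.0, 1.0]
x, cdf = bezier_cdf_point(xs, zs, 0.3)
_, pdf = bezier_pdf_point(xs, zs, 0.3)
```

The density is a ratio of two degree-$(n-1)$ Bézier functions, so it is undefined wherever the denominator of Equation (4) vanishes; the sketch does not guard against that case.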

In Wagner and Wilson [1993, 1995], we presented several applications of the Bézier family of univariate distributions for modeling simulation inputs. This distribution family has a natural, extensible parameterization that allows unlimited flexibility in representing the probabilistic behavior of many real-world processes. Moreover, because its numerical evaluation can be performed efficiently, this family is well suited to graphical interactive simulation input modeling. These considerations motivated the extension to bivariate Bézier distributions that is detailed in the next section.

3. FORMULATION OF BIVARIATE BÉZIER DISTRIBUTIONS

We begin the development of bivariate Bézier distributions by considering in Section 3.1 the setup for general two-dimensional Bézier surfaces. In Section 3.2 we specialize this setup to obtain a parametric representation for the c.d.f. $F_{XY}(\cdot, \cdot)$ corresponding to the Bézier random vector $(X, Y)^{\mathrm{T}}$ with prespecified univariate marginal Bézier c.d.f.'s $F_X(\cdot)$ and $F_Y(\cdot)$; and in Section 3.3 we establish the required properties of the marginal distributions. In Section 3.4 we derive the parametric form of the bivariate Bézier p.d.f. $f_{XY}(\cdot, \cdot)$. In Section 3.5 we formulate the conditional c.d.f.'s $F_{X|Y}(\cdot \mid \cdot)$ and $F_{Y|X}(\cdot \mid \cdot)$; and in Section 3.6 we calculate the covariance between $X$ and $Y$. An efficient method for generating bivariate Bézier random vectors is presented in Section 3.7.

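3.1 Definition of Bézier Surfaces

Starting from a set of control points represented by the column vectors $\{\mathbf{q}_{i,j} \equiv (x_{i,j}, y_{i,j}, z_{i,j})^{\mathrm{T}} : i = 0, 1, \ldots, n_x;\ j = 0, 1, \ldots, n_y\}$, we have the corresponding two-dimensional Bézier surface in three-dimensional Euclidean space that is given parametrically as

$$\begin{aligned} \mathbf{Q}(t_x, t_y) &= [Q_x(t_x, t_y; n_x, n_y, \mathbf{x}),\ Q_y(t_x, t_y; n_x, n_y, \mathbf{y}),\ Q_z(t_x, t_y; n_x, n_y, \mathbf{z})]^{\mathrm{T}} \\ &= \sum_{i=0}^{n_x} \sum_{j=0}^{n_y} B_{n_x,i}(t_x)\, B_{n_y,j}(t_y)\, (x_{i,j}, y_{i,j}, z_{i,j})^{\mathrm{T}} \\ &= \sum_{i=0}^{n_x} \sum_{j=0}^{n_y} B_{n_x,i}(t_x)\, B_{n_y,j}(t_y)\, \mathbf{q}_{i,j} \end{aligned} \tag{5}$$

for all $t_x, t_y \in [0, 1]$, where

$$\mathbf{x} \equiv [x_{i,j}] = \begin{bmatrix} x_{0,0} & x_{0,1} & \cdots & x_{0,n_y} \\ x_{1,0} & x_{1,1} & \cdots & x_{1,n_y} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n_x,0} & x_{n_x,1} & \cdots & x_{n_x,n_y} \end{bmatrix}, \qquad \mathbf{y} \equiv [y_{i,j}] = \begin{bmatrix} y_{0,0} & y_{0,1} & \cdots & y_{0,n_y} \\ y_{1,0} & y_{1,1} & \cdots & y_{1,n_y} \\ \vdots & \vdots & \ddots & \vdots \\ y_{n_x,0} & y_{n_x,1} & \cdots & y_{n_x,n_y} \end{bmatrix},$$

and

$$\mathbf{z} \equiv [z_{i,j}] = \begin{bmatrix} z_{0,0} & z_{0,1} & \cdots & z_{0,n_y} \\ z_{1,0} & z_{1,1} & \cdots & z_{1,n_y} \\ \vdots & \vdots & \ddots & \vdots \\ z_{n_x,0} & z_{n_x,1} & \cdots & z_{n_x,n_y} \end{bmatrix}$$

respectively denote the $(n_x + 1) \times (n_y + 1)$ matrices of the $x$-, $y$-, and $z$-coordinates of the given control points.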

Extending the discussion at the end of Section 2.1 about the role of the Bernstein polynomials (2) in regulating the shape of a Bézier curve, we see that the geometry of a Bézier surface is determined by weights of the form $B_{n_x,i}(t_x)\, B_{n_y,j}(t_y)$, where for each $(t_x, t_y)^{\mathrm{T}} \in [0, 1] \times [0, 1]$, we have

$$\sum_{i=0}^{n_x} \sum_{j=0}^{n_y} B_{n_x,i}(t_x)\, B_{n_y,j}(t_y) = \left[ \sum_{i=0}^{n_x} B_{n_x,i}(t_x) \right] \left[ \sum_{j=0}^{n_y} B_{n_y,j}(t_y) \right] = 1 \cdot 1 = 1.$$

Thus the Bézier surface $\{\mathbf{Q}(t_x, t_y) : (t_x, t_y)^{\mathrm{T}} \in [0, 1] \times [0, 1]\}$ lies in the convex hull of the control points $\{\mathbf{q}_{i,j} : i = 0, 1, \ldots, n_x;\ j = 0, 1, \ldots, n_y\}$. In particular, as both $t_x$ and $t_y$ increase from 0 to 1, the weight $B_{n_x,0}(t_x)\, B_{n_y,0}(t_y)$ of the initial control point $\mathbf{q}_{0,0}$ decreases from 1 to 0 and the weight $B_{n_x,n_x}(t_x)\, B_{n_y,n_y}(t_y)$ of the final control point $\mathbf{q}_{n_x,n_y}$ increases from 0 to 1 in the overall weighted average (convex combination) of control points that determines the current location $\mathbf{Q}(t_x, t_y)$ on the Bézier surface. Thus the control points act like magnets; and the magnetic attraction exerted on the Bézier surface $\{\mathbf{Q}(t_x, t_y) : (t_x, t_y)^{\mathrm{T}} \in [0, 1] \times [0, 1]\}$ by the control point $\mathbf{q}_{i,j}$ is strongest at the values $t_x = i/n_x$ and $t_y = j/n_y$ for the parameters $t_x$ and $t_y$ so that the corresponding point $\mathbf{Q}(t_x, t_y)$ on the surface is in the vicinity of $\mathbf{q}_{i,j}$. If the weight (magnetic attraction) of a control point is 1, then the Bézier surface is forced to pass through that control point exactly.

3.2 Bivariate Bézier Distribution Functions

If $(X, Y)^{\mathrm{T}}$ is a continuous random vector with bounded support $[x_*, x^*] \times [y_*, y^*]$, unknown c.d.f. $F_{XY}(\cdot, \cdot)$, and unknown p.d.f. $f_{XY}(\cdot, \cdot)$, then we can approximate $F_{XY}(\cdot, \cdot)$ arbitrarily closely with an appropriate Bézier surface of the form (5) that has sufficiently large values of $n_x$ and $n_y$ [Farin 1990], where the control points $\{\mathbf{q}_{i,j} : i = 0, 1, \ldots, n_x;\ j = 0, 1, \ldots, n_y\}$ have been arranged so as to ensure the basic requirements of a joint distribution function: (a) $F_{XY}(x, y)$ is monotonically nondecreasing and continuous from the right in $x$ and $y$; (b) $F_{XY}(x_*, y) = 0$ for all $y$ and $F_{XY}(x, y_*) = 0$ for all $x$; (c) $F_{XY}(x^*, y^*) = 1$; and (d) $F_{XY}(x_2, y_2) - F_{XY}(x_1, y_2) - F_{XY}(x_2, y_1) + F_{XY}(x_1, y_1) \ge 0$ if $x_1 < x_2$ and $y_1 < y_2$.

Given marginal Bézier c.d.f.'s $F_X(\cdot)$ and $F_Y(\cdot)$ for the random variables $X$ and $Y$, respectively, we seek a joint c.d.f. $F_{XY}(\cdot, \cdot)$ for the random vector $(X, Y)^{\mathrm{T}}$, where (i) $F_X(\cdot)$ is represented parametrically by

$$\{x(t_x),\ F_X[x(t_x)]\}^{\mathrm{T}} = \sum_{i=0}^{n_x} B_{n_x,i}(t_x)\, [x_i,\ z_i^{(x)}]^{\mathrm{T}} \tag{6}$$

for all $t_x \in [0, 1]$; and (ii) $F_Y(\cdot)$ is represented parametrically by

$$\{y(t_y),\ F_Y[y(t_y)]\}^{\mathrm{T}} = \sum_{j=0}^{n_y} B_{n_y,j}(t_y)\, [y_j,\ z_j^{(y)}]^{\mathrm{T}} \tag{7}$$

for all $t_y \in [0, 1]$. To satisfy Equations (6) and (7), we formulate the joint c.d.f. of $(X, Y)^{\mathrm{T}}$ according to the parametric representation (5) such that

$$\{x(t_x),\ y(t_y),\ F_{XY}[x(t_x), y(t_y)]\}^{\mathrm{T}} = \sum_{i=0}^{n_x} \sum_{j=0}^{n_y} B_{n_x,i}(t_x)\, B_{n_y,j}(t_y)\, (x_{i,j}, y_{i,j}, z_{i,j})^{\mathrm{T}} \tag{8}$$

for all $t_x, t_y \in [0, 1]$, where the matrices $\mathbf{x}$, $\mathbf{y}$, and $\mathbf{z}$ of $x$-, $y$-, and $z$-coordinates of the control points $\{\mathbf{q}_{i,j}\}$ have the respective forms

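$$\mathbf{x} = [x_{i,j}] = \begin{bmatrix} x_0 & x_0 & \cdots & x_0 \\ x_1 & x_1 & \cdots & x_1 \\ \vdots & \vdots & \ddots & \vdots \\ x_{n_x} & x_{n_x} & \cdots & x_{n_x} \end{bmatrix}, \qquad \mathbf{y} = [y_{i,j}] = \begin{bmatrix} y_0 & y_1 & \cdots & y_{n_y} \\ y_0 & y_1 & \cdots & y_{n_y} \\ \vdots & \vdots & \ddots & \vdots \\ y_0 & y_1 & \cdots & y_{n_y} \end{bmatrix}, \tag{9}$$

and

$$\mathbf{z} = [z_{i,j}] = \begin{bmatrix} 0 & 0 & \cdots & 0 & 0 \\ 0 & z_{1,1} & \cdots & z_{1,n_y-1} & z_1^{(x)} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & z_{n_x-1,1} & \cdots & z_{n_x-1,n_y-1} & z_{n_x-1}^{(x)} \\ 0 & z_1^{(y)} & \cdots & z_{n_y-1}^{(y)} & 1 \end{bmatrix}. \tag{10}$$

The special form of the matrix $\mathbf{x}$ ensures that in the definition (5) of the associated Bézier surface, we have

$$Q_x(t_x, t_y; n_x, n_y, \mathbf{x}) = \sum_{i=0}^{n_x} B_{n_x,i}(t_x)\, x_i; \tag{11}$$

thus Equation (11) defines a univariate Bézier function of $t_x$ alone, which for simplicity is subsequently denoted by $x(t_x)$ as in Equation (8). Similarly, the special form of the matrix $\mathbf{y}$ ensures that

$$Q_y(t_x, t_y; n_x, n_y, \mathbf{y}) = \sum_{j=0}^{n_y} B_{n_y,j}(t_y)\, y_j; \tag{12}$$

thus Equation (12) is a univariate Bézier function of $t_y$ alone, which for simplicity is subsequently denoted by $y(t_y)$ as in Equation (8).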

To satisfy the basic requirements (a) through (d) of a bivariate c.d.f. mentioned in the first paragraph of this section, we impose the following conditions on the matrices $\mathbf{x}$, $\mathbf{y}$, and $\mathbf{z}$. The condition

$$x_0 = x_*, \quad y_0 = y_*, \quad \text{and} \quad z_{i,j} = 0 \ \text{ if } i = 0 \text{ or } j = 0 \tag{13}$$

is sufficient to ensure requirement (b); and the condition

$$x_{n_x} = x^*, \quad y_{n_y} = y^*, \quad \text{and} \quad z_{n_x,n_y} = 1 \tag{14}$$

is sufficient to ensure requirement (c). Finally, it follows from the results presented in Section 3.4 on bivariate Bézier p.d.f.'s that requirements (a) and (d) are satisfied if

$$z_{i+1,j+1} - z_{i,j+1} - z_{i+1,j} + z_{i,j} \ge 0 \tag{15}$$

for $i = 0, 1, \ldots, n_x - 1$ and $j = 0, 1, \ldots, n_y - 1$.

Remark. If the individual $x$- and $y$-components of successive control points are nondecreasing (that is, if $x_0 \le x_1 \le \cdots \le x_{n_x}$ and $y_0 \le y_1 \le \cdots \le y_{n_y}$), then the functions $x(t_x)$ and $y(t_y)$ respectively defined by Equations (11) and (12) are strictly increasing on $[0, 1]$; consequently for each point $(\alpha, \beta)^{\mathrm{T}} \in [x_*, x^*] \times [y_*, y^*]$, there is a unique pair

$$(t_x, t_y)^{\mathrm{T}} = [x^{-1}(\alpha),\ y^{-1}(\beta)]^{\mathrm{T}} \quad \text{such that} \quad [x(t_x),\ y(t_y)]^{\mathrm{T}} = (\alpha, \beta)^{\mathrm{T}}.$$

Thus we see that in the parametric representation (8) of a bivariate Bézier c.d.f., the coordinate functions $x(\cdot)$ and $y(\cdot)$ have well-defined inverses $x^{-1}(\cdot)$ and $y^{-1}(\cdot)$. This property will play a central role in the variate-generation scheme detailed in Section 3.7. In the next section we examine in detail the marginal distributions for $X$ and $Y$ that follow from this setup for the bivariate Bézier distribution of the random vector $(X, Y)^{\mathrm{T}}$.
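Conditions (13) through (15) are easy to test mechanically on a candidate matrix of $z$-coordinates. The Python sketch below is illustrative only: the helper names `independent_net` and `is_valid_cdf_net` and the tolerance are assumptions, not part of PRIME. The product construction $z_{i,j} = z_i^{(x)} z_j^{(y)}$ is used as a test case because it yields $F_{XY} = F_X F_Y$ (independent components) and therefore must satisfy every condition.

```python
def independent_net(zx, zy):
    """z-matrix with z[i][j] = zx[i] * zy[j] (independent X and Y)."""
    return [[a * b for b in zy] for a in zx]

def is_valid_cdf_net(z, tol=1e-12):
    """Check the z-coordinate parts of conditions (13), (14), and (15)."""
    nx, ny = len(z) - 1, len(z[0]) - 1
    # Condition (13): first row and first column are zero.
    edges_zero = (all(abs(z[i][0]) <= tol for i in range(nx + 1)) and
                  all(abs(z[0][j]) <= tol for j in range(ny + 1)))
    # Condition (14): final control point has z = 1.
    corner_one = abs(z[nx][ny] - 1.0) <= tol
    # Condition (15): every mixed second difference is nonnegative.
    increments_ok = all(
        z[i + 1][j + 1] - z[i][j + 1] - z[i + 1][j] + z[i][j] >= -tol
        for i in range(nx) for j in range(ny))
    return edges_zero and corner_one and increments_ok

# Hypothetical marginal z-coordinates (illustration only).
zx = [0.0, 0.2, 0.7, 1.0]
zy = [0.0, 0.5, 1.0]
z_ok = independent_net(zx, zy)

# A net that violates condition (15) at (i, j) = (0, 1).
z_bad = [[0.0, 0.0, 0.0],
         [0.0, 0.6, 0.5],
         [0.0, 0.5, 1.0]]

valid_ok = is_valid_cdf_net(z_ok)
valid_bad = is_valid_cdf_net(z_bad)
last_column = [row[-1] for row in z_ok]   # marginal z_i^{(x)} as in Equation (10)
```

In this example the last column of `z_ok` reproduces `zx`, in agreement with the layout of Equation (10).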

3.3 Marginal Distributions

Marginal c.d.f.'s. To justify Equations (6) and (7), we calculate $F_X[x(t_x)]$ for a fixed $t_x \in [0, 1]$ by taking $t_y = 1$ in Equation (8) and applying Equation (10); thus we obtain

$$\begin{aligned} F_X[x(t_x)] = F_{XY}[x(t_x), y(1)] &= \sum_{i=0}^{n_x} \sum_{j=0}^{n_y} B_{n_x,i}(t_x)\, B_{n_y,j}(1)\, z_{i,j} \\ &= \sum_{i=0}^{n_x} \sum_{j=0}^{n_y} B_{n_x,i}(t_x)\, I_{\{j = n_y\}}\, z_{i,j} = \sum_{i=0}^{n_x} B_{n_x,i}(t_x)\, z_i^{(x)}, \end{aligned} \tag{16}$$

where $I_{\{k = \ell\}} \equiv 1$ if $k = \ell$ and $I_{\{k = \ell\}} \equiv 0$ if $k \ne \ell$. An expression similar to Equation (16) can be obtained for $F_Y[y(t_y)]$ when $t_y \in [0, 1]$. Combining Equations (11), (13), (14), and (16), we see that the bivariate Bézier c.d.f. $F_{XY}(\cdot, \cdot)$ will have univariate marginal Bézier c.d.f.'s of the form (6) and (7).

Marginal p.d.f.'s. It follows immediately from Equation (4) that if $X$ has the marginal c.d.f. (6), then the corresponding marginal p.d.f. has the parametric representation $\{x(t_x),\ f_X[x(t_x)]\}^{\mathrm{T}}$ for all $t_x \in [0, 1]$, where $x(t_x)$ is given by Equation (11) and

$$f_X[x(t_x)] = \frac{\sum_{i=0}^{n_x - 1} B_{n_x-1,i}(t_x)\, \Delta z_i^{(x)}}{\sum_{i=0}^{n_x - 1} B_{n_x-1,i}(t_x)\, \Delta x_i}.$$

Analogous expressions describe the marginal p.d.f. of $Y$ for all $t_y \in [0, 1]$.

Marginal Moments. Wagner and Wilson [1995] derived computational formulas for the marginal moments of $X$ and $Y$, and these results are summarized here for completeness. The expected value of $X$ is given by a closed-form expression in the control points, Equation (17), and a similar formula yields the expected value of $Y$. Closed-form expressions analogous to Equation (17) can be given for the higher-order noncentral moments of a Bézier variate [Flanigan 1993], but these expressions are cumbersome to evaluate.

If $X$ is nonnegative, then an efficient numerical method for computing the moments of $X$ can be based on the following result,

$$E[X^k] = \int_0^1 k\,[x(t_x)]^{k-1}\, \{1 - F_X[x(t_x)]\}\, |x'(t_x)|\, dt_x \quad \text{for } k = 1, 2, \ldots, \tag{18}$$

where $x'(t_x)$ is the derivative of $x(t_x)$ with respect to $t_x$. If $X$ has a (finite) negative lower bound $x_*$, then Equation (18) can be applied to calculate the noncentral moments of the nonnegative random variable $X_* \equiv X - x_*$. It follows immediately that $E[X] = E[X_*] + x_*$, and the remaining central moments of $X$ coincide with the corresponding central moments of $X_*$. Thus the standard deviation, the skewness, and the kurtosis of $X$ are the same as for $X_*$. A parallel development yields the marginal moments of $Y$.
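Equation (18) lends itself to simple numerical quadrature. The following Python sketch applies the midpoint rule; the quadrature rule, the step count, and the function name `moment` are assumptions made for illustration and are not necessarily what PRIME uses. The derivative $x'(t_x)$ is computed as $n_x \sum_i B_{n_x-1,i}(t_x)\, \Delta x_i$, the standard derivative formula for a Bézier function.

```python
from math import comb

def bernstein(n, i, t):
    """Bernstein polynomial B_{n,i}(t) for t in [0, 1]."""
    return comb(n, i) * t**i * (1.0 - t)**(n - i)

def moment(xs, zs, k, steps=2000):
    """Approximate E[X^k] for nonnegative X by the midpoint rule on Eq. (18)."""
    n = len(xs) - 1
    total = 0.0
    for s in range(steps):
        t = (s + 0.5) / steps                     # midpoint of subinterval s
        x = sum(bernstein(n, i, t) * xs[i] for i in range(n + 1))
        cdf = sum(bernstein(n, i, t) * zs[i] for i in range(n + 1))
        dx = n * sum(bernstein(n - 1, i, t) * (xs[i + 1] - xs[i])
                     for i in range(n))           # x'(t)
        total += k * x**(k - 1) * (1.0 - cdf) * abs(dx)
    return total / steps

# Assumed example: control points that reproduce Uniform[0, 1].
xs = [0.0, 1.0 / 3.0, 2.0 / 3.0, 1.0]
zs = [0.0, 1.0 / 3.0, 2.0 / 3.0, 1.0]
mean = moment(xs, zs, 1)
second = moment(xs, zs, 2)
variance = second - mean**2
```

For this uniform example the exact values are $E[X] = 1/2$ and $E[X^2] = 1/3$, which the sketch reproduces to within the quadrature error.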

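3.4 Bivariate Bézier Density Functions

For the bivariate random vector $(X, Y)^{\mathrm{T}}$ whose joint c.d.f. $F_{XY}(\cdot, \cdot)$ is specified by Equation (8), the corresponding joint p.d.f. $f_{XY}(\cdot, \cdot)$ is given parametrically by $\{x(t_x),\ y(t_y),\ f_{XY}[x(t_x), y(t_y)]\}^{\mathrm{T}}$ for all $t_x, t_y \in [0, 1]$; and in this section we derive an explicit expression for this p.d.f. in terms of the Bernstein polynomials and control points defining the c.d.f. Using the functions $Q_z(t_x, t_y)$, $x(t_x)$, and $y(t_y)$ respectively defined by Equations (5), (11), and (12), we apply the chain rule for partial differentiation to $Q_z(t_x, t_y)$ in order to obtain

$$\frac{\partial^2}{\partial t_x\, \partial t_y} Q_z(t_x, t_y) = \frac{\partial^2}{\partial t_x\, \partial t_y} F_{XY}[x(t_x), y(t_y)] = f_{XY}[x(t_x), y(t_y)]\, x'(t_x)\, y'(t_y); \tag{19}$$

and rearranging Equation (19), we obtain the desired parametric representation of the bivariate Bézier p.d.f.,

$$f_{XY}[x(t_x), y(t_y)] = \frac{\partial^2 Q_z(t_x, t_y) / \partial t_x\, \partial t_y}{x'(t_x)\, y'(t_y)}. \tag{20}$$

The first derivative of an $n$th-degree Bézier function is an $(n-1)$st-degree Bézier function (see Section 5.11 of Farin [1990]); and it is straightforward to verify that

$$x'(t_x) = n_x \sum_{i=0}^{n_x - 1} B_{n_x-1,i}(t_x)\, \Delta x_i. \tag{21}$$

An analogous formula holds for $y'(t_y)$. It follows that

$$\frac{\partial^2}{\partial t_x\, \partial t_y} Q_z(t_x, t_y) = n_x n_y \sum_{i=0}^{n_x - 1} \sum_{j=0}^{n_y - 1} B_{n_x-1,i}(t_x)\, B_{n_y-1,j}(t_y)\, \Delta z_{i,j}, \tag{22}$$

where

$$\Delta z_{i,j} \equiv z_{i+1,j+1} - z_{i,j+1} - z_{i+1,j} + z_{i,j} \tag{23}$$

for $i = 0, 1, \ldots, n_x - 1$ and $j = 0, 1, \ldots, n_y - 1$. Combining Equations (20), (21), and (22), we see that the joint density is given parametrically for all $t_x, t_y \in [0, 1]$ as

$$f_{XY}[x(t_x), y(t_y)] = \frac{\sum_{i=0}^{n_x - 1} \sum_{j=0}^{n_y - 1} B_{n_x-1,i}(t_x)\, B_{n_y-1,j}(t_y)\, \Delta z_{i,j}}{\left[ \sum_{i=0}^{n_x - 1} B_{n_x-1,i}(t_x)\, \Delta x_i \right] \left[ \sum_{j=0}^{n_y - 1} B_{n_y-1,j}(t_y)\, \Delta y_j \right]}. \tag{24}$$

Remark. At the end of Section 3.2, we formulated conditions sufficient to ensure that $x'(t_x) > 0$ and $y'(t_y) > 0$ for all $t_x, t_y \in [0, 1]$. From condition (15) and Equations (23) and (24), it also follows that $f_{XY}\{x(t_x), y(t_y)\} \ge 0$ for all $t_x, t_y \in [0, 1]$ as required for a legitimate p.d.f.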
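The joint density can be evaluated directly from the control points as a ratio of Bézier sums: the mixed second differences of the $z$-coordinates in the numerator, and the first differences of the $x$- and $y$-coordinates in the denominator, which is the form that follows from Equations (19) through (22). The Python sketch below assumes that form; the function name `joint_density` and the $3 \times 3$ control net are hypothetical illustrations.

```python
from math import comb

def bernstein(n, i, t):
    """Bernstein polynomial B_{n,i}(t) for t in [0, 1]."""
    return comb(n, i) * t**i * (1.0 - t)**(n - i)

def joint_density(xs, ys, z, tx, ty):
    """f_XY[x(tx), y(ty)] as a ratio of Bezier sums of finite differences."""
    nx, ny = len(xs) - 1, len(ys) - 1
    num = sum(bernstein(nx - 1, i, tx) * bernstein(ny - 1, j, ty) *
              (z[i + 1][j + 1] - z[i][j + 1] - z[i + 1][j] + z[i][j])
              for i in range(nx) for j in range(ny))
    dx = sum(bernstein(nx - 1, i, tx) * (xs[i + 1] - xs[i]) for i in range(nx))
    dy = sum(bernstein(ny - 1, j, ty) * (ys[j + 1] - ys[j]) for j in range(ny))
    return num / (dx * dy)       # the factors n_x * n_y cancel

# Hypothetical net with uniform marginals on [0, 1] and positive dependence.
xs = [0.0, 0.5, 1.0]
ys = [0.0, 0.5, 1.0]
z = [[0.0, 0.0, 0.0],
     [0.0, 0.5, 0.5],
     [0.0, 0.5, 1.0]]

f_diag = joint_density(xs, ys, z, 0.9, 0.9)   # both components large
f_off = joint_density(xs, ys, z, 0.9, 0.1)    # components disagree
```

For this particular net the density works out by hand to $2[(1 - t_x)(1 - t_y) + t_x t_y]$, so `f_diag` is 1.64 and `f_off` is 0.36; the larger value on the diagonal reflects the positive dependence built into the interior control point.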

3.5 Conditional Bézier Distributions

Given $Y = y(t_y)$, the conditional c.d.f. of $X$ at the point $x(t_x)$ is obtained as follows. If $t_x, t_y \in [0, 1]$ and $f_Y[y(t_y)] > 0$, then

$$F_{X|Y}[x(t_x) \mid y(t_y)] = \frac{\left. \dfrac{\partial}{\partial y} F_{XY}[x(t_x), y] \right|_{y = y(t_y)}}{f_Y[y(t_y)]} = \frac{\partial Q_z(t_x, t_y) / \partial t_y}{y'(t_y)\, f_Y[y(t_y)]} = \sum_{i=0}^{n_x} B_{n_x,i}(t_x)\, z_i^{(x|y)}.$$

It follows that the conditional distribution of $X$ given $Y = y(t_y)$ is univariate Bézier with

$$\{x(t_x),\ F_{X|Y}[x(t_x) \mid y(t_y)]\}^{\mathrm{T}} = \sum_{i=0}^{n_x} B_{n_x,i}(t_x)\, [x_i,\ z_i^{(x|y)}]^{\mathrm{T}} \tag{25}$$

provided $f_Y[y(t_y)] > 0$. Notice that the control points $\{[x_i,\ z_i^{(x|y)}]^{\mathrm{T}} : i = 0, 1, \ldots, n_x\}$ for the Bézier curve representing the conditional c.d.f. of $X$ given $Y = y(t_y)$ have the same $x$-coordinates as in Equation (6); and the corresponding $z$-coordinates are given by

$$z_i^{(x|y)} = \frac{\sum_{j=0}^{n_y - 1} B_{n_y-1,j}(t_y)\, [z_{i,j+1} - z_{i,j}]}{\sum_{j=0}^{n_y - 1} B_{n_y-1,j}(t_y)\, \Delta z_j^{(y)}} \tag{26}$$

for $i = 0, 1, \ldots, n_x$. An analogous formulation yields $F_{Y|X}(\cdot \mid \cdot)$, the conditional c.d.f. of $Y$ given $X$.
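The conditional control points are inexpensive to compute once $t_y$ is known. The sketch below evaluates the $z$-coordinates $z_i^{(x|y)}$ as the ratio in Equation (26): a Bézier sum of the row differences $z_{i,j+1} - z_{i,j}$ divided by the same sum applied to the marginal differences $\Delta z_j^{(y)}$, which are read from the last row of the $z$-matrix. The function name `conditional_z` and the two control nets are hypothetical illustrations.

```python
from math import comb

def bernstein(n, i, t):
    """Bernstein polynomial B_{n,i}(t) for t in [0, 1]."""
    return comb(n, i) * t**i * (1.0 - t)**(n - i)

def conditional_z(z, ty):
    """z-coordinates z_i^{(x|y)}, i = 0..n_x, of the c.d.f. of X given Y = y(ty).

    Assumes f_Y[y(ty)] > 0, so the denominator below is positive.
    """
    nx, ny = len(z) - 1, len(z[0]) - 1
    zy = z[nx]                      # marginal z_j^{(y)}: last row of z
    den = sum(bernstein(ny - 1, j, ty) * (zy[j + 1] - zy[j]) for j in range(ny))
    return [sum(bernstein(ny - 1, j, ty) * (z[i][j + 1] - z[i][j])
                for j in range(ny)) / den
            for i in range(nx + 1)]

# Independent case: z[i][j] = zx[i] * zy[j]; conditioning should change nothing.
zx = [0.0, 0.2, 0.7, 1.0]
zy = [0.0, 0.5, 1.0]
z_indep = [[a * b for b in zy] for a in zx]
cond_indep = conditional_z(z_indep, 0.4)

# Dependent case (hypothetical net with uniform marginals).
z_dep = [[0.0, 0.0, 0.0],
         [0.0, 0.5, 0.5],
         [0.0, 0.5, 1.0]]
cond_dep = conditional_z(z_dep, 0.25)
```

In the independent case the conditional $z$-coordinates coincide with the marginal ones, as expected. For the dependent net a hand calculation gives $z^{(x|y)} = (0,\ 1 - t_y,\ 1)$, so small values of $Y$ shift probability mass of $X$ toward small values.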


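3.6 Covariance between Bézier Variates

The covariance between $X$ and $Y$, $\mathrm{Cov}(X, Y)$, is readily computed from the control points $\{\mathbf{q}_{i,j} : i = 0, 1, \ldots, n_x;\ j = 0, 1, \ldots, n_y\}$. We have

$$\begin{aligned} \mathrm{Cov}(X, Y) &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} [x - \mu_X][y - \mu_Y]\, f_{XY}(x, y)\, dx\, dy \\ &= n_x n_y \int_0^1 \int_0^1 [x(u) - \mu_X][y(w) - \mu_Y] \left\{ \sum_{i=0}^{n_x - 1} \sum_{j=0}^{n_y - 1} B_{n_x-1,i}(u)\, B_{n_y-1,j}(w)\, \Delta z_{i,j} \right\} du\, dw, \end{aligned} \tag{27}$$

where $\Delta z_{i,j} \equiv z_{i+1,j+1} - z_{i,j+1} - z_{i+1,j} + z_{i,j}$ is the mixed second difference appearing in condition (15). Because $x(u)$ and $y(w)$ are themselves Bézier functions, the integrand in Equation (27) is a polynomial in $u$ and $w$, and the double integral reduces to a finite sum over the control points with coefficients indexed by $i = 0, 1, \ldots, n_x$ and $j = 0, 1, \ldots, n_y$. Notice that the expected value $E[X]$ (respectively, $E[Y]$) and the variance $\mathrm{Var}[X]$ (respectively, $\mathrm{Var}[Y]$) are readily evaluated using the computational formulas (17) and (18). Thus we can easily compute $\mathrm{Corr}(X, Y)$, the correlation between $X$ and $Y$.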
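As a rough numerical check on the covariance, the double integral over the unit square in Equation (27) can be approximated directly. The Python sketch below uses a midpoint rule on a regular grid and the identity $\mathrm{Cov}(X, Y) = E[XY] - E[X]E[Y]$; this is an assumed stand-in for the closed-form evaluation described in the text, and the function names and control nets are hypothetical.

```python
from math import comb

def bernstein(n, i, t):
    """Bernstein polynomial B_{n,i}(t) for t in [0, 1]."""
    return comb(n, i) * t**i * (1.0 - t)**(n - i)

def bezier(coeffs, t):
    """Univariate Bezier function with the given coefficients."""
    n = len(coeffs) - 1
    return sum(bernstein(n, i, t) * c for i, c in enumerate(coeffs))

def covariance(xs, ys, z, steps=200):
    """Midpoint-rule approximation of Cov(X, Y) on the (u, w) unit square."""
    nx, ny = len(xs) - 1, len(ys) - 1
    d = [[z[i + 1][j + 1] - z[i][j + 1] - z[i + 1][j] + z[i][j]
          for j in range(ny)] for i in range(nx)]          # mixed differences
    ts = [(s + 0.5) / steps for s in range(steps)]
    xv = [bezier(xs, t) for t in ts]
    yv = [bezier(ys, t) for t in ts]
    bu = [[bernstein(nx - 1, i, t) for i in range(nx)] for t in ts]
    bw = [[bernstein(ny - 1, j, t) for j in range(ny)] for t in ts]
    ex = ey = exy = 0.0
    for a in range(steps):
        for b in range(steps):
            # Density weight on the parameter scale: f_XY * x'(u) * y'(w).
            g = nx * ny * sum(bu[a][i] * bw[b][j] * d[i][j]
                              for i in range(nx) for j in range(ny))
            ex += xv[a] * g
            ey += yv[b] * g
            exy += xv[a] * yv[b] * g
    h2 = 1.0 / steps**2
    return exy * h2 - (ex * h2) * (ey * h2)

xs = ys = [0.0, 0.5, 1.0]
z_dep = [[0.0, 0.0, 0.0], [0.0, 0.5, 0.5], [0.0, 0.5, 1.0]]
z_ind = [[0.0, 0.0, 0.0], [0.0, 0.25, 0.5], [0.0, 0.5, 1.0]]
cov_dep = covariance(xs, ys, z_dep)
cov_ind = covariance(xs, ys, z_ind)
```

For the dependent net the exact covariance is $1/36$ by direct integration of the density $2[(1-u)(1-w) + uw]$; for the product net the covariance is zero.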

3.7 Generation of Bézier Vectors

The random vector $(X, Y)^{\mathrm{T}}$ can be generated efficiently using a variant of the method of conditional distributions that we call "piecewise conditioning." To explain piecewise conditioning, we first summarize the conventional method of conditional distributions.

Method of Conditional Distributions. Given a pair of independent random numbers $U_1$ and $U_2$, we compute $Y$ from $U_1$ by inversion of the marginal c.d.f. $F_Y(\cdot)$; then given $Y$, we compute $X$ from $U_2$ by inversion of the conditional c.d.f. $F_{X|Y}(\cdot \mid Y)$. Specifically, this involves the following steps.

1. [Generate random numbers $U_1$, $U_2$.] Generate $U_1, U_2 \sim \text{Uniform}[0, 1]$ independently.

2. [Compute $\hat{t}_y = y^{-1}[F_Y^{-1}(U_1)]$.] Use a root-finding procedure such as bisection search [Conte and de Boor 1980] to find the root $\hat{t}_y$ of the equation

$$U_1 = F_Y[y(\hat{t}_y)] = \sum_{j=0}^{n_y} B_{n_y,j}(\hat{t}_y)\, z_j^{(y)} \tag{28}$$

within the interval of uncertainty $[0, 1]$.

3. [Compute control points of $F_{X|Y}[\,\cdot \mid y(\hat{t}_y)]$.] Evaluate Equation (26) with $t_y = \hat{t}_y$ to compute the $z$-coordinates $\{z_i^{(x|y)} : i = 0, 1, \ldots, n_x\}$ of the control points for the conditional c.d.f. of $X$ given $Y = y(\hat{t}_y)$.

4. [Compute $\hat{t}_x = x^{-1}\{F_{X|Y}^{-1}[U_2 \mid y(\hat{t}_y)]\}$.] Use a root-finding procedure such as bisection search to find the root $\hat{t}_x$ of the equation

$$U_2 = F_{X|Y}[x(\hat{t}_x) \mid y(\hat{t}_y)] = \sum_{i=0}^{n_x} B_{n_x,i}(\hat{t}_x)\, z_i^{(x|y)} \tag{29}$$

within the interval of uncertainty $[0, 1]$.

5. [Return $(X, Y)^{\mathrm{T}}$.] Deliver the vector

$$(X, Y)^{\mathrm{T}} = [x(\hat{t}_x),\ y(\hat{t}_y)]^{\mathrm{T}} = \left[ \sum_{i=0}^{n_x} B_{n_x,i}(\hat{t}_x)\, x_i,\ \sum_{j=0}^{n_y} B_{n_y,j}(\hat{t}_y)\, y_j \right]^{\mathrm{T}}. \tag{30}$$

The chief disadvantage of the conventional method of conditional distributions is that the root-finding operations required to solve Equations (28) and (29) are relatively slow.
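The five steps of the conventional method can be sketched in a few lines of Python. This is an illustrative reading of the steps above, not the authors' C implementation: the function names, the plain bisection routine with a fixed tolerance, and the example control net are all assumptions. The conditional $z$-coordinates in step 3 are computed as a ratio of Bézier sums of first differences, with the marginal coordinates $z_j^{(y)}$ taken from the last row of the $z$-matrix.

```python
import random
from math import comb

def bernstein(n, i, t):
    """Bernstein polynomial B_{n,i}(t) for t in [0, 1]."""
    return comb(n, i) * t**i * (1.0 - t)**(n - i)

def bezier(coeffs, t):
    """Univariate Bezier function with the given coefficients."""
    n = len(coeffs) - 1
    return sum(bernstein(n, i, t) * c for i, c in enumerate(coeffs))

def conditional_z(z, ty):
    """z-coordinates of the conditional c.d.f. of X given Y = y(ty)."""
    nx, ny = len(z) - 1, len(z[0]) - 1
    zy = z[nx]
    den = sum(bernstein(ny - 1, j, ty) * (zy[j + 1] - zy[j]) for j in range(ny))
    return [sum(bernstein(ny - 1, j, ty) * (z[i][j + 1] - z[i][j])
                for j in range(ny)) / den for i in range(nx + 1)]

def solve_increasing(f, target, lo=0.0, hi=1.0, tol=1e-10):
    """Plain bisection for f(t) = target with f nondecreasing on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def sample_pair(xs, ys, z, u1, u2):
    """Steps 2 to 5: map random numbers (u1, u2) to a vector (X, Y)."""
    zy = z[len(xs) - 1]                                   # marginal z^{(y)}
    ty = solve_increasing(lambda t: bezier(zy, t), u1)    # step 2
    zc = conditional_z(z, ty)                             # step 3
    tx = solve_increasing(lambda t: bezier(zc, t), u2)    # step 4
    return bezier(xs, tx), bezier(ys, ty)                 # step 5

# Assumed example: independent Uniform[0, 1] components.
xs = [0.0, 1.0 / 3.0, 2.0 / 3.0, 1.0]
ys = [0.0, 0.5, 1.0]
z = [[a * b for b in ys] for a in xs]

x, y = sample_pair(xs, ys, z, u1=0.3, u2=0.7)
rng = random.Random(12345)                                # step 1
draws = [sample_pair(xs, ys, z, rng.random(), rng.random()) for _ in range(50)]
```

With independent uniform components, inversion simply returns $Y = U_1$ and $X = U_2$, which makes the example easy to check by hand. Every call performs two full bisection searches over $[0, 1]$, which is the cost that piecewise conditioning is designed to reduce.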

Method of Piecewise Conditioning. The objective of the method of piece-wise conditioning is to accelerate the root-finding operations required to solve

Equations (28) and (29) by exploiting a precomputed partition of the range of

values of the marginal c.d.f. of Y together with the corresponding precom-

puted partitions of the range of values of the conditional c.d.f. of X given Y. Aformal statement of this algorithm is given in the following.

ACM TransactIons on Modeling and Computer Slmulatlon, Vol 5, No 3, July 1995

Simulation Input Modeling . 177

Piecewise Conditioning Algorithm

0. [Initialize: set up partitions of the ranges of $F_Y(\cdot)$ and $F_{X|Y}(\cdot \mid \cdot)$.]

   a. Compute the partition of the range of the function $F_Y(\cdot)$ on $[y_*, y^*]$,

      $$\bigl[\, y(t_y(g)),\ F_Y\{y(t_y(g))\} \,\bigr]^T \quad \text{for } g = 1, 2, \ldots, g_{\max}, \tag{31}$$

      where the cutoff values $\{t_y(g) : g = 1, 2, \ldots, g_{\max}\}$ are regularly spaced in $[0, 1]$.

   b. For each $y(t_y(g))$ $(g = 1, 2, \ldots, g_{\max})$, compute the partition of the range of the function $F_{X|Y}[\,\cdot \mid y(t_y(g))]$ on $[x_*, x^*]$,

      $$\bigl[\, x(t_x(h)),\ F_{X|Y}\{x(t_x(h)) \mid y(t_y(g))\} \,\bigr]^T \quad \text{for } h = 1, 2, \ldots, h_{\max}, \tag{32}$$

      where the cutoff values $\{t_x(h) : h = 1, 2, \ldots, h_{\max}\}$ are regularly spaced in $[0, 1]$.

1. [Generate random numbers $U_1$, $U_2$.] Generate $U_1, U_2 \sim \text{Uniform}[0, 1]$ independently.

2. [Compute $\hat{t}_y = y^{-1}[F_Y^{-1}(U_1)]$.]

   a. Find the subinterval of uncertainty for $\hat{t}_y$ corresponding to the appropriate subinterval of the partition (31); that is, find $\hat{g}$ such that

      $$F_Y[y(t_y(\hat{g} - 1))] < U_1 \le F_Y[y(t_y(\hat{g}))] \quad \text{implies} \quad t_y(\hat{g} - 1) < \hat{t}_y \le t_y(\hat{g}). \tag{33}$$

   b. Starting with the initial interval of uncertainty (33) for $\hat{t}_y$, use a modified bisection search to find the root $\hat{t}_y$ of Equation (28) in the interval (33).

3. [Compute control points of $F_{X|Y}[\,\cdot \mid y(\hat{t}_y)]$.] Evaluate Equation (26) with $t_y = \hat{t}_y$ to compute the z-coordinates $\{z_i^{(x|y)} : i = 0, 1, \ldots, n_x\}$ of the control points for the conditional c.d.f. of X given $Y = y(\hat{t}_y)$.

4. [Compute $\hat{t}_x = x^{-1}\{F_{X|Y}^{-1}[U_2 \mid y(\hat{t}_y)]\}$.]

   a. Find a subinterval of uncertainty for $\hat{t}_x$ corresponding to the partition (32) defined by $g = \hat{g} - 1$; that is, find $h_1$ such that

      $$F_{X|Y}[x(t_x(h_1 - 1)) \mid y(t_y(\hat{g} - 1))] < U_2 \le F_{X|Y}[x(t_x(h_1)) \mid y(t_y(\hat{g} - 1))]. \tag{34}$$

   b. Find a subinterval of uncertainty for $\hat{t}_x$ corresponding to the partition (32) defined by $g = \hat{g}$; that is, find $h_2$ such that

      $$F_{X|Y}[x(t_x(h_2 - 1)) \mid y(t_y(\hat{g}))] < U_2 \le F_{X|Y}[x(t_x(h_2)) \mid y(t_y(\hat{g}))] \tag{35}$$

      implies $t_x(h_2 - 1) < \hat{t}_x \le t_x(h_2)$.

   c. Combine Equations (34) and (35) to obtain an interval of uncertainty for $\hat{t}_x$ corresponding to $Y = y(\hat{t}_y)$:

      $$\min\{t_x(h_1 - 1),\, t_x(h_2 - 1)\} < \hat{t}_x \le \max\{t_x(h_1),\, t_x(h_2)\}. \tag{36}$$

   d. Starting with the initial interval of uncertainty (36) for $\hat{t}_x$, use a modified bisection search to find the root $\hat{t}_x$ of Equation (29) in the interval (36).

5. [Return $(X, Y)^T$.] Deliver the vector (30).
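The bracketing operations in steps 0a, 2a, and 4a through 4c can be illustrated with a short Python sketch. It is not the authors' C implementation: the function names `build_partition`, `find_bracket`, and `combine_brackets` are hypothetical, the c.d.f. is assumed to be available as a nondecreasing function of the curve parameter t, and the regularly spaced cutoff values are assumed to include both endpoints of [0, 1].

```python
from bisect import bisect_left

def build_partition(cdf_of_t, g_max):
    """Step 0a: tabulate F at g_max regularly spaced cutoff values in [0, 1]."""
    # Assumption: cutoffs include both endpoints, t(g) = g / (g_max - 1).
    ts = [g / (g_max - 1) for g in range(g_max)]
    fs = [cdf_of_t(t) for t in ts]
    return ts, fs

def find_bracket(ts, fs, u):
    """Steps 2a, 4a, 4b: return (t_lo, t_hi) with F(t_lo) < u <= F(t_hi)."""
    g = bisect_left(fs, u)            # first index whose tabulated F is >= u
    g = min(max(g, 1), len(ts) - 1)   # clamp so that g - 1 is a valid index
    return ts[g - 1], ts[g]

def combine_brackets(b1, b2):
    """Step 4c: smallest interval enclosing both brackets, as in (36)."""
    return min(b1[0], b2[0]), max(b1[1], b2[1])

# Toy example: F(t) = t**2 stands in for the tabulated c.d.f.
ts, fs = build_partition(lambda t: t * t, 11)
lo, hi = find_bracket(ts, fs, 0.3)
print(lo, hi, combine_brackets((0.5, 0.6), (0.4, 0.5)))
```

Because the tabulated c.d.f. values are sorted, the table lookup costs only a logarithmic number of comparisons, and the subsequent root search starts from an interval of width 1/(g_max - 1) rather than from all of [0, 1].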


Table I. Performance of Schemes for Generating 10,000 Bivariate Bézier Vectors

                                          Sampling method
Performance measure           Conditional distributions   Piecewise conditioning
For n_x = n_y = 6:
  Setup time (sec.)                      0.0                      1.37
  Generation time (sec.)                60.8                     39.0
For n_x = n_y = 9:
  Setup time (sec.)                      0.0                      2.36
  Generation time (sec.)               100.3                     77.7

On each iteration of the modified bisection search, the search procedure terminates if either of the following criteria is satisfied:

(a) The half-length of the latest interval of uncertainty is less than a user-specified tolerance $\varepsilon_1$, so that $\varepsilon_1$ is also an upper bound on the absolute difference between the exact and estimated values of the Bézier variate to be generated.

(b) The user-specified tolerance $\varepsilon_2$ is an upper bound on the absolute difference between the value of the current random number and the value of the current c.d.f. at the latest estimate of the Bézier variate to be generated.

In our implementation of the modified bisection search, we used $\varepsilon_1 = \varepsilon_2 = 10^{-6}$.
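The two stopping rules can be sketched in Python as follows. This is a plain bisection with criteria (a) and (b) attached; whatever further modifications the authors' search uses are not shown. The function name `bisect_root`, the default tolerances, and the iteration cap `max_iter` are assumptions made for illustration.

```python
def bisect_root(cdf_of_t, u, t_lo, t_hi, eps1=1e-6, eps2=1e-6, max_iter=200):
    """Solve cdf_of_t(t) = u for t in [t_lo, t_hi], assuming a nondecreasing c.d.f."""
    for _ in range(max_iter):
        t_mid = 0.5 * (t_lo + t_hi)
        f_mid = cdf_of_t(t_mid)
        # Criterion (a): half-length of the interval of uncertainty is below eps1.
        # Criterion (b): c.d.f. at the estimate is within eps2 of the random number.
        if 0.5 * (t_hi - t_lo) < eps1 or abs(f_mid - u) <= eps2:
            return t_mid
        if f_mid < u:
            t_lo = t_mid   # root lies in the upper half
        else:
            t_hi = t_mid   # root lies in the lower half
    return 0.5 * (t_lo + t_hi)

# Toy example: F(t) = t**2 on the initial interval of uncertainty [0.5, 0.6].
t_hat = bisect_root(lambda t: t * t, 0.3, 0.5, 0.6)
print(t_hat)
```

Starting from a narrow initial interval, such as one supplied by a precomputed partition, reduces the number of halvings needed before criterion (a) is met.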

The method of piecewise conditioning for generating bivariate Bézier variates has been implemented in a function written in the C programming language. In this procedure the parameters $g_{\max}$ and $h_{\max}$ have the default values $g_{\max} = h_{\max} = 101$; and compared to the method of conditional distributions, the additional storage requirement for the piecewise conditioning procedure is 20,503 floating-point data items (words). Table I contains a comparison of the method of conditional distributions and the method of piecewise conditioning with respect to setup time and time to generate 10,000 bivariate vectors for the Bézier distribution described in Section 5.2. All execution times were measured on a 75-MHz 80486-based microcomputer.

The results displayed in Table I are typical of our computational experience. The storage requirements and execution times for both schemes are modest; and to generate 10,000 bivariate Bézier random vectors, the method of piecewise conditioning is approximately 50% (respectively, 25%) faster than the conventional method of conditional distributions for $n_x = n_y = 6$ (respectively, for $n_x = n_y = 9$). We believe that the method of piecewise conditioning is sufficiently fast for most practical applications.

4. MODELING BIVARIATE BEZIER DISTRIBUTIONS USING PRIME

PRIME is a graphical Windows-based software system that incorporates the methodology developed in Section 3 to help an analyst estimate the bivariate input processes arising in simulation studies. PRIME is designed for IBM-compatible microcomputers equipped with a math coprocessor and a pointing device such as a mouse. Written entirely in the C programming language, PRIME has been developed to run under Microsoft Windows [Microsoft Corp. 1990] version 3.0 or later. A public-domain version of PRIME is available upon request.

4.1 Interactive Operations of PRIME

PRIME is designed to be easy and intuitive to use. The construction of a bivariate distribution is performed through the actions of the mouse, and several options are conveniently available through menu selections. In PRIME the user manipulates the marginal distributions independently of each other; then to complete the construction of a bivariate input model, the user manipulates the joint p.d.f. (24) or selected conditional c.d.f.’s like (25). For example, to edit (subjectively estimate) the marginal c.d.f. (6) of X as displayed in the upper right-hand corner of Figure 1, the user may move any of the control points $\{[x_i, z_i^{(x)}]^T : i = 0, 1, \ldots, n_x\}$ by clicking on a chosen control point in the window depicting $F_X(\cdot)$ and then dragging that control point to the desired location by moving the mouse. Control points are represented as small black squares, and each control point is given a label corresponding to its index i in Equation (6). The user may also add or delete control points via the mouse and the keyboard.

The number of control points that can be used in PRIME is limited only by the amount of available computer memory because storage for control points is allocated dynamically as required. However, increasing the number of control points increases the computational time to evaluate the Bézier functions; and this in turn affects the speed of display updating, random vector generation, and distribution fitting.
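The dependence of evaluation cost on the number of control points can be seen in a minimal Python sketch of evaluating one point on a univariate Bézier c.d.f. curve. The sketch assumes the usual Bernstein-polynomial form, in which both coordinates are weighted sums over all control points; the function names `bernstein` and `bezier_point` are hypothetical.

```python
from math import comb

def bernstein(n, i, t):
    """Bernstein polynomial of degree n and index i evaluated at t in [0, 1]."""
    return comb(n, i) * t**i * (1.0 - t)**(n - i)

def bezier_point(xs, zs, t):
    """Return (x(t), F(x(t))) for control points (xs[i], zs[i]), i = 0..n."""
    n = len(xs) - 1
    weights = [bernstein(n, i, t) for i in range(n + 1)]  # one weight per control point
    x = sum(w * xi for w, xi in zip(weights, xs))
    z = sum(w * zi for w, zi in zip(weights, zs))
    return x, z

# Three control points describing a c.d.f. on [0, 10].
print(bezier_point([0.0, 5.0, 10.0], [0.0, 0.5, 1.0], 0.5))
```

Each evaluated point requires one Bernstein weight per control point, so redrawing a curve or inverting a c.d.f. numerically becomes proportionally more expensive as control points are added.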

As elaborated at the end of Sections 2.1 and 3.1, each control point acts like a magnet that pulls the associated curve or surface in the direction of the control point. Moving a control point causes the displayed distribution to be updated (nearly) instantaneously so that the user gets immediate feedback on the effects of editing that distribution. To model the stochastic dependence between the components of the Bézier random vector $(X, Y)^T$, the user must employ trial and error in manipulating the control points for the joint p.d.f. or for selected conditional distributions. In our experience it is usually easier to achieve the desired bivariate dependence by editing some well-chosen conditional distributions. The number of conditional distributions that can be edited simultaneously is limited only by the available computer memory. In contrast to the impact on PRIME’s performance of increasing the number of control points, increasing the number of displayed conditional distributions does not significantly decrease the speed with which PRIME’s displays are updated. Figure 1 depicts a typical PRIME session, showing a bivariate joint density $f_{XY}(\cdot, \cdot)$ with Cov(X, Y) = 2.287 and Corr(X, Y) = 0.324 together with the corresponding marginal c.d.f.’s.


[Figure 1: PRIME screen capture (file X8Y3POS.BIV) showing the joint p.d.f. window f(x, y), with displayed covariance 2.287 and correlation 0.324, and the marginal c.d.f. windows F(x) and F(y).]

Fig. 1. A typical bivariate PRIME session.

4.2 Data-Driven Estimation of Bivariate Bézier Distributions

In addition to subjective estimation of bivariate Bézier distributions by interactive manipulation of the control points, PRIME allows data-driven estimation of the control points that yield the “best” fit to the sample data according to a variety of statistical-estimation principles. Suppose that a random sample $\{(X_k, Y_k)^T : k = 1, 2, \ldots, m\}$ has been taken from an unknown continuous bivariate distribution, and we seek to approximate this distribution with a bivariate Bézier c.d.f. $F_{XY}(\cdot, \cdot)$. Let

$$F_m(x, y) = \frac{1}{m}\,\bigl[\text{no. of pairs } (X_k, Y_k)^T \text{ such that } X_k \le x,\ Y_k \le y \text{ for } k = 1, \ldots, m\bigr]$$

denote the corresponding empirical c.d.f., and let $F_m(x)$ and $G_m(y)$ denote the corresponding marginal empirical c.d.f.’s for X and Y, respectively. The first step in fitting a bivariate Bézier distribution to the data set $\{(X_k, Y_k)^T : k = 1, 2, \ldots, m\}$ is to fit a marginal distribution to each component of this sample separately. For completeness, we summarize the scheme for data-driven estimation of univariate Bézier distributions that has been implemented in PRIME as detailed in Wagner and Wilson [1995].
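The joint and marginal empirical c.d.f.’s defined above can be computed by direct counting, as in the following Python sketch. The function names are hypothetical, and the sketch uses the conventional non-strict inequalities; for continuous data the choice between strict and non-strict inequalities affects the result only at the sample points themselves.

```python
def empirical_cdf(sample, x, y):
    """Fraction of pairs (X_k, Y_k) with X_k <= x and Y_k <= y."""
    count = sum(1 for xk, yk in sample if xk <= x and yk <= y)
    return count / len(sample)

def marginal_ecdf_x(sample, x):
    """Marginal empirical c.d.f. of X: fraction of pairs with X_k <= x."""
    return sum(1 for xk, _ in sample if xk <= x) / len(sample)

def marginal_ecdf_y(sample, y):
    """Marginal empirical c.d.f. of Y: fraction of pairs with Y_k <= y."""
    return sum(1 for _, yk in sample if yk <= y) / len(sample)

# Small made-up sample of (X_k, Y_k) pairs, for illustration only.
sample = [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0), (4.0, 0.5)]
print(empirical_cdf(sample, 2.0, 2.0),
      marginal_ecdf_x(sample, 2.0),
      marginal_ecdf_y(sample, 2.0))
```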

We seek to fit a univariate marginal Bézier c.d.f. $\hat{F}_X[\,\cdot\,; n_x, \mathbf{x}, \mathbf{z}^{(x)}]$ of the form (6) to the data set $\{X_k : k = 1, \ldots, m\}$, where

$$X_{(1)} \le X_{(2)} \le \cdots \le X_{(m)}$$

are the order statistics for the sample $\{X_k\}$. For this purpose we select an appropriate distance function $d\{F_m(\cdot), \hat{F}_X[\,\cdot\,; n_x, \mathbf{x}, \mathbf{z}^{(x)}]\}$ between the empirical c.d.f. $F_m(\cdot)$ and the fitted c.d.f. $\hat{F}_X[\,\cdot\,; n_x, \mathbf{x}, \mathbf{z}^{(x)}]$. PRIME uses a nonlinear optimization procedure based on the Nelder-Mead simplex search algorithm [Nelder and Mead 1964; Olsson and Nelson 1975] to solve the problem

$$\min_{\mathbf{x},\, \mathbf{z}^{(x)}} \; d\bigl\{F_m(\cdot),\ \hat{F}_X[\,\cdot\,; n_x, \mathbf{x}, \mathbf{z}^{(x)}]\bigr\}$$

subject to

$$\hat{f}_X[x(t_x); n_x, \mathbf{x}, \mathbf{z}^{(x)}] \ge 0 \ \text{ for all } t_x \in [0, 1], \qquad z_0^{(x)} = 0, \quad z_{n_x}^{(x)} = 1, \qquad x_0 \le X_{(1)}, \quad x_{n_x} \ge X_{(m)}. \tag{37}$$

The classical fitting methods that have been incorporated into PRIME include: least squares estimation, minimum $L_1$ norm estimation, minimum $L_\infty$ norm estimation, maximum likelihood estimation, moment matching, and percentile matching. Selecting one of these fitting schemes from a drop-down menu implicitly involves selecting an appropriate distance function $d\{F_m(\cdot), \hat{F}_X[\,\cdot\,; n_x, \mathbf{x}, \mathbf{z}^{(x)}]\}$. The same procedure is used to fit a univariate marginal Bézier c.d.f. $\hat{F}_Y[\,\cdot\,; n_y, \mathbf{y}, \mathbf{z}^{(y)}]$ to the sample data set $\{Y_k\}$. Our computational experience indicates that in comparison to other widely used optimization procedures, the Nelder-Mead simplex search procedure is faster and more stable in solving the optimization problem (37) for each of the distribution-fitting methods mentioned; see Flanigan [1993, p. 44] and Swain et al. [1988].

After fitting marginal c.d.f.’s $\hat{F}_X[\,\cdot\,; n_x, \mathbf{x}, \mathbf{z}^{(x)}]$ and $\hat{F}_Y[\,\cdot\,; n_y, \mathbf{y}, \mathbf{z}^{(y)}]$ separately to the corresponding components of the random sample $\{(X_k, Y_k)^T : k = 1, 2, \ldots, m\}$, the user can model the dependencies between these components. The dependencies are modeled by moving the control points associated with either the joint Bézier p.d.f. $f_{XY}(\cdot, \cdot)$ or selected conditional Bézier c.d.f.’s or p.d.f.’s until the desired stochastic dependence is achieved. In the next section we illustrate both subjective and data-driven estimation of bivariate Bézier distributions using PRIME.
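As an illustration of one possible distance function for univariate fitting, the following Python sketch evaluates a least-squares discrepancy between the empirical c.d.f. and a Bézier c.d.f. with given control points. It is a sketch under stated assumptions rather than PRIME's objective: the empirical c.d.f. at the k-th order statistic is taken to be k/m, the x-coordinates of the control points are assumed nondecreasing so that x(t) can be inverted by bisection, the end z-coordinates are assumed to be 0 and 1, and all function names are hypothetical.

```python
from math import comb

def bezier(coords, t):
    """Bernstein-weighted sum of one coordinate of the control points."""
    n = len(coords) - 1
    return sum(comb(n, i) * t**i * (1 - t)**(n - i) * c for i, c in enumerate(coords))

def fitted_cdf(xs, zs, x, tol=1e-10):
    """Evaluate the Bezier c.d.f. at x by solving x(t) = x with bisection."""
    if x <= xs[0]:
        return 0.0   # assumes zs[0] == 0
    if x >= xs[-1]:
        return 1.0   # assumes zs[-1] == 1
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bezier(xs, mid) < x:
            lo = mid
        else:
            hi = mid
    return bezier(zs, 0.5 * (lo + hi))

def least_squares_distance(xs, zs, data):
    """Sum of squared gaps between empirical and fitted c.d.f. at the order statistics."""
    m = len(data)
    ordered = sorted(data)
    return sum(((k + 1) / m - fitted_cdf(xs, zs, xk)) ** 2
               for k, xk in enumerate(ordered))

# Two control points give the Uniform[0, 10] c.d.f.; the data are made up.
print(least_squares_distance([0.0, 10.0], [0.0, 1.0], [2.5, 5.0, 7.5, 10.0]))
```

A simplex search would then vary the free control-point coordinates to reduce this value while respecting the constraints in (37).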


[Figure 2: PRIME screen capture (file NEGPROC.BIV) showing the joint p.d.f. window f(x, y), with displayed covariance -0.230 and correlation -0.369, the marginal c.d.f. windows F(x) and F(y), and the conditional c.d.f. windows F(Y|X=2.5) and F(X|Y=7.0).]

Fig. 2. A bivariate distribution of processing times $(X, Y)^T$ with Corr(X, Y) = -0.369.

5. EXAMPLES

5.1 Subjectively Fitted Bivariate Bezier Distributions

In the absence of data, PRIME can be used to construct a bivariate input process conceptualized from subjective information and expertise. For example, suppose it is known that the processing times for two successive manufacturing operations are negatively correlated, with a correlation coefficient of -0.37. The first processing time is denoted by X, where X is known to have a minimum value of 2 minutes, a maximum value of 6 minutes, and a most likely (modal) value of 3 minutes. The second processing time is denoted by Y, where Y is known to have a minimum value of 3 minutes, a maximum value of 9 minutes, and a most likely (modal) value of 6 minutes. To construct the joint distribution of $(X, Y)^T$, we must first specify the two marginal distributions. Figure 2 shows the marginal distributions that were built by moving the control points corresponding to the respective marginal c.d.f.’s until we matched the lower bounds $x_*$ and $y_*$, the upper bounds $x^*$ and $y^*$, and the modes that were specified. After the marginal c.d.f.’s were fitted, we manipulated the stochastic dependence between X and Y by moving the control points associated with the conditional c.d.f.’s $F_{Y|X}(\cdot \mid 2.5)$ and $F_{X|Y}(\cdot \mid 7.0)$ until the correlation of the fitted joint distribution was approximately equal to -0.37. The conditional c.d.f.’s are edited in the same manner as the marginal c.d.f.’s, except that the control points for the conditional c.d.f.’s are only allowed to move in the vertical direction so as to preserve the marginal distributions and the special structure of the matrices x and y defined by Equation (9). In Figure 2 the displayed joint distribution has a correlation of -0.369.

[Figure 3: PRIME screen capture showing the joint p.d.f. window f(x, y), with displayed covariance -5.322 and correlation -0.645, the marginal c.d.f. windows F(x) and F(y), and the conditional c.d.f. windows F(Y|X=9.0) and F(X|Y=2.0).]

Fig. 3. A negatively correlated distribution with uniform marginals.

5.1.1 Uniform Marginal Distributions. Figure 3 depicts a PRIME session in which each marginal distribution is uniform; that is, X ~ Uniform[0, 10] and Y ~ Uniform[0, 10]. Figure 3 displays a bivariate distribution for $(X, Y)^T$ with Cov(X, Y) = -5.322 and Corr(X, Y) = -0.645. Beneath the window containing the joint p.d.f., there are two windows displaying the marginal c.d.f.’s $F_X(\cdot)$ and $F_Y(\cdot)$; and these latter windows also display as dashed curves the corresponding marginal p.d.f.’s $f_X(\cdot)$ and $f_Y(\cdot)$. To the right of the joint p.d.f. window in Figure 3 are two windows depicting the conditional c.d.f.’s $F_{Y|X}(\cdot \mid 9.0)$ and $F_{X|Y}(\cdot \mid 2.0)$; and these c.d.f. windows also display as dashed curves the corresponding conditional p.d.f.’s. As shown in the joint p.d.f. window of Figure 3, most of the probability mass is concentrated along the line $y = -x + 10$ for $0 \le x \le 10$.


[Figure 4: PRIME screen capture (file X5Y4POS.BIV) showing the joint p.d.f. window f(x, y), with displayed covariance 2.133 and correlation 0.539, the marginal c.d.f. windows F(x) and F(y), and the conditional c.d.f. windows F(Y|X=3.0) and F(X|Y=8.0).]

Fig. 4. A positively correlated distribution with nonuniform marginals.

5.1.2 Nonuniform Marginal Distributions. Figure 4 depicts the joint c.d.f. and p.d.f. of the nonuniform random vector $(X, Y)^T$. Notice that for this case, Cov(X, Y) = 2.133 and Corr(X, Y) = 0.539. Figure 4 also shows the conditional c.d.f.’s $F_{Y|X}(\cdot \mid 3.0)$ and $F_{X|Y}(\cdot \mid 8.0)$.

5.2 A Bivariate Bezier Distribution Fitted to Sample Data

In a small manufacturing simulation study, the initial machining time X and the subsequent rework time Y were recorded for m = 44 workpieces; then the sample data set $\{(X_k, Y_k)^T : k = 1, \ldots, m\}$ was imported into PRIME. Table II displays the sample statistics for each marginal distribution together with the corresponding population characteristics for the marginal Bézier distributions that were fitted in PRIME by the method of moment matching. The fitting algorithm for this example took approximately 20 seconds on a 75-MHz 80486 microcomputer.

Table II. Empirical vs. Fitted Joint Distribution of Machining and Rework Times

                   Machining Time X_k          Rework Time Y_k
Characteristic     Sample    Fitted dist.      Sample    Fitted dist.
Mean                4.781       4.781           4.647       4.647
Variance            2.863       2.863           2.783       2.783
Skewness            0.193       0.193           0.334       0.334
Kurtosis            1.872       1.872           1.967       1.967
Minimum             0.340      -0.151           0.500      -0.054
Maximum             9.700      10.590           9.500      10.237

Figure 5 contains three PRIME windows depicting key aspects of the empirical and fitted distributions.

1. In the left-hand window, the fitted joint p.d.f. $f_{XY}(\cdot, \cdot)$ is superimposed on a bivariate histogram of the sample data set.

2. In the upper right-hand window, the fitted c.d.f. $\hat{F}_X[\,\cdot\,; n_x, \mathbf{x}, \mathbf{z}^{(x)}]$ of initial machining times is plotted together with the empirical c.d.f. $F_m(\cdot)$.

3. In the lower right-hand window, the fitted c.d.f. $\hat{F}_Y[\,\cdot\,; n_y, \mathbf{y}, \mathbf{z}^{(y)}]$ of rework times is plotted together with the empirical c.d.f. $G_m(\cdot)$.

[Figure 5: PRIME screen capture (file BIVDATA.BIV) showing the joint p.d.f. window f(x, y), with displayed covariance -1.993 and correlation -0.257, and the marginal c.d.f. windows F(x) and F(y).]

Fig. 5. A bivariate distribution fitted to sample machining and rework times.

After each marginal distribution was satisfactorily fitted to its corresponding component of the sample $\{(X_k, Y_k)^T\}$, the stochastic dependence between X and Y was modeled. The control points for the joint p.d.f. were manipulated until the theoretical covariance Cov(X, Y) for the fitted distribution matched the sample covariance $\widehat{\mathrm{Cov}}(X, Y) = -1.979$. For this example, interactively editing the bivariate Bézier p.d.f. took approximately 30 seconds on a 75-MHz 80486 computer.
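The four sample characteristics that moment matching targets in Table II (mean, variance, skewness, and kurtosis) can be computed as in the following Python sketch. The divisor m for the central moments and the use of non-excess kurtosis are assumptions, since the estimator conventions used in PRIME are not stated here; the function name `sample_moments` and the example data are hypothetical.

```python
def sample_moments(data):
    """Return (mean, variance, skewness, kurtosis) using central moments with divisor m."""
    m = len(data)
    mean = sum(data) / m

    def central(p):
        # p-th central sample moment (assumed divisor m, not m - 1)
        return sum((v - mean) ** p for v in data) / m

    var = central(2)
    skew = central(3) / var ** 1.5   # standardized third moment
    kurt = central(4) / var ** 2     # standardized fourth moment (non-excess)
    return mean, var, skew, kurt

# Made-up data, for illustration only.
print(sample_moments([1.0, 2.0, 3.0, 4.0, 5.0]))
```

Under this convention a uniform distribution has kurtosis 1.8, so values near 1.9, as in Table II, indicate marginals that are flatter than a normal distribution.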

6. SUMMARY, CONCLUSIONS, AND RECOMMENDATIONS

6.1 Bivariate Bezier Distribution Families

If $(X, Y)^T$ is a continuous random vector having a bivariate Bézier distribution function $F_{XY}(\cdot, \cdot)$ as defined in Section 3, then the distribution of $(X, Y)^T$ has the following properties.

1. The joint c.d.f., represented parametrically by Equation (8) as a Bézier surface, is similar in form to a Bézier curve. The Bernstein polynomials (2) are the same basis functions used for both Bézier curves and Bézier surfaces.

2. The joint p.d.f., $f_{XY}(\cdot, \cdot)$, has a closed-form parametric representation as a ratio of Bézier functions, as given by Equation (24).

3. The conditional c.d.f.’s, $F_{X|Y}(\cdot \mid \cdot)$ and $F_{Y|X}(\cdot \mid \cdot)$, have the same parametric form as the univariate marginal c.d.f.’s; and the control points that define the conditional c.d.f.’s are easily related to the control points that define the joint c.d.f.

4. The conditional p.d.f.’s, $f_{X|Y}(\cdot \mid \cdot)$ and $f_{Y|X}(\cdot \mid \cdot)$, have the same parametric form as the univariate marginal p.d.f.’s; and the control points that define the conditional p.d.f.’s are easily related to the control points that define the joint c.d.f.

5. The covariance between X and Y, Cov(X, Y), has a closed-form expression given by Equation (27).

6. The parameterization of the bivariate Bézier distribution family is both natural and open-ended. The coordinates of the control points define the distribution parameters, and if additional flexibility is required, it is easily achieved by adding more control points.

6.2 Modeling Simulation Inputs with PRIME

From the user’s point of view, PRIME is an easy-to-use, intuitive, graphical

software system. PRIME provides immediate visual feedback on the user’s

editing of the currently displayed distribution. The user can easily alter an

inappropriately configured distribution by adding, deleting, or relocating one

or more of the relevant control points for the joint p.d.f., the marginal p.d.f.’s

or c.d.f.’s, or selected conditional c.d.f.’s or p.d.f.’s. PRIME also provides a

framework for viewing and manipulating bivariate distributions.

6.3 Recommendations for Future Work

Several aspects of this work require further research and development.

1. Of particular interest is the extension of the methodology to handle trivariate and higher-dimensional distributions. In principle such an extension is feasible but cumbersome. Specifically, a trivariate Bézier distribution function $F_{WXY}(\cdot, \cdot, \cdot)$ for the continuous random vector $(W, X, Y)^T$ with bounded support could be given parametrically by a three-dimensional Bézier surface in four-dimensional Euclidean space, as follows:

   $$\mathbf{R}(t_w, t_x, t_y) = \{w(t_w),\, x(t_x),\, y(t_y),\, F_{WXY}[w(t_w), x(t_x), y(t_y)]\}^T = \sum_{\ell=0}^{n_w} \sum_{i=0}^{n_x} \sum_{j=0}^{n_y} B_{n_w,\ell}(t_w)\, B_{n_x,i}(t_x)\, B_{n_y,j}(t_y)\, \mathbf{r}_{\ell,i,j} \tag{38}$$

   for all $t_w, t_x, t_y \in [0, 1]$, where the triply subscripted hypermatrices

   $$\mathbf{w} = [w_{\ell,i,j}], \quad \mathbf{x} = [x_{\ell,i,j}], \quad \mathbf{y} = [y_{\ell,i,j}], \quad \text{and} \quad \mathbf{z} = [z_{\ell,i,j}] \tag{39}$$

   each consist of $(n_w + 1) \times (n_x + 1) \times (n_y + 1)$ elements respectively defining the w-, x-, y-, and z-coordinates of the control points

   $$\{\mathbf{r}_{\ell,i,j} = (w_{\ell,i,j},\, x_{\ell,i,j},\, y_{\ell,i,j},\, z_{\ell,i,j})^T : \ell = 0, 1, \ldots, n_w;\ i = 0, 1, \ldots, n_x;\ j = 0, 1, \ldots, n_y\}.$$

   To ensure that the c.d.f. $F_{WXY}(\cdot, \cdot, \cdot)$ will have the parametric representation (38) as well as given univariate Bézier marginal c.d.f.’s $F_W(\cdot)$, $F_X(\cdot)$, and $F_Y(\cdot)$, we must formulate appropriate extensions of Equations (9) and (10) that apply to the hypermatrices defined in Equation (39). Moreover, it is desirable to ensure that the bivariate marginals $F_{WX}(\cdot, \cdot)$, $F_{WY}(\cdot, \cdot)$, and $F_{XY}(\cdot, \cdot)$ should have the same functional form (8) as the bivariate Bézier c.d.f.’s developed in this article. We have verified that the desired extension of Equations (9) and (10) to trivariate and higher-dimensional Bézier distributions is theoretically feasible, but the resulting trivariate distribution is awkward to use.

   Because the desired parametric form (38) for the trivariate Bézier c.d.f. can be arranged in principle, the analyses of the associated joint p.d.f., of the conditional Bézier distributions, and of the covariance between any two Bézier variates will parallel closely the developments presented in Sections 3.4, 3.5, and 3.6, respectively. Although the sampling scheme of Section 3.7 can be adapted to the generation of trivariate Bézier vectors, it is unclear whether such a sampling scheme is generally practical for higher-dimensional Bézier vectors.

2. For subjective estimation of continuous bivariate distributions, we also require more comprehensive techniques for visually representing and manipulating general types of stochastic dependence.

3. For data-driven estimation of continuous bivariate distributions, we re-

quire fully automated fitting schemes to estimate not only the marginal

distributions of the target random vector but also the stochastic depen-

dence between the two components of that random vector.

All these topics are the subject of ongoing work.
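The triple sum in the proposed trivariate representation (38) can be evaluated directly, as the following Python sketch shows. It is only an illustration of the tensor-product form: the nested-list layout `ctrl[l][i][j]` holding a 4-tuple (w, x, y, z) per control point and the function names are assumptions, and the sketch does not enforce the marginal-consistency conditions that extensions of Equations (9) and (10) would impose.

```python
from math import comb

def bernstein(n, i, t):
    """Bernstein polynomial of degree n and index i evaluated at t in [0, 1]."""
    return comb(n, i) * t**i * (1.0 - t)**(n - i)

def trivariate_bezier(ctrl, tw, tx, ty):
    """Evaluate the point R(tw, tx, ty); ctrl[l][i][j] is a 4-tuple (w, x, y, z)."""
    nw, nx, ny = len(ctrl) - 1, len(ctrl[0]) - 1, len(ctrl[0][0]) - 1
    point = [0.0, 0.0, 0.0, 0.0]
    for l in range(nw + 1):
        bw = bernstein(nw, l, tw)
        for i in range(nx + 1):
            bx = bernstein(nx, i, tx)
            for j in range(ny + 1):
                weight = bw * bx * bernstein(ny, j, ty)
                for c in range(4):
                    point[c] += weight * ctrl[l][i][j][c]
    return tuple(point)

# Eight control points at the corners of the unit cube with z = w * x * y,
# which reproduces the c.d.f. of three independent Uniform[0, 1] variates.
ctrl = [[[(float(l), float(i), float(j), float(l * i * j)) for j in range(2)]
         for i in range(2)] for l in range(2)]
print(trivariate_bezier(ctrl, 0.5, 0.5, 0.5))
```

The number of terms is the product $(n_w + 1)(n_x + 1)(n_y + 1)$, which is one reason a trivariate extension is more costly to evaluate and to edit interactively than the bivariate case.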


ACKNOWLEDGMENTS

The authors thank Stephen D. Roberts, Bruce Schmeiser, and Arnold Sweet

for many enlightening discussions on this paper. The authors also thank Paul

Fishwick and Robert O’Keefe, the Special Issue Coeditors, and the three

anonymous referees for several suggestions that improved the readability of

this paper.

REFERENCES

CARIO, M. C. AND NELSON, B. L. 1995. Autoregressive to anything: Time series input processes for simulation. Working Paper, Dept. of Industrial, Welding and Systems Engineering, Ohio State Univ., Columbus.

CONTE, S. D. AND DE BOOR, C. 1980. Elementary Numerical Analysis: An Algorithmic Approach, 3rd ed. McGraw-Hill, New York.

DEBROTA, D. J., DITTUS, R. S., ROBERTS, S. D., AND WILSON, J. R. 1989. Visual interactive fitting of bounded Johnson distributions. Simulation 52, 5 (May), 199-205.

FARIN, G. 1990. Curves and Surfaces for Computer Aided Geometric Design: A Practical Guide, 2nd ed. Academic Press, New York.

FLANIGAN, M. A. 1993. A flexible, interactive, graphical approach to modeling stochastic input processes. Ph.D. dissertation, School of Industrial Engineering, Purdue Univ., West Lafayette, Ind.

JAGERMAN, D. L. AND MELAMED, B. 1992a. The transition and autocorrelation structure of TES processes, Part I: General theory. Commun. Stat. Stoch. Models 8, 2, 193-219.

JAGERMAN, D. L. AND MELAMED, B. 1992b. The transition and autocorrelation structure of TES processes, Part II: Special cases. Commun. Stat. Stoch. Models 8, 3, 499-527.

JOHNSON, N. L. 1949a. Systems of frequency curves generated by methods of translation. Biometrika 36, 149-176.

JOHNSON, N. L. 1949b. Bivariate distributions based on simple translation systems. Biometrika 36, 297-304.

JOHNSON, M. E. 1987. Multivariate Statistical Simulation. Wiley, New York.

LEWIS, P. A. W. AND ORAV, E. J. 1989. Simulation Methodology for Statisticians, Operations Analysts, and Engineers, Vol. 1. Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, Calif.

MELAMED, B., HILL, J. R., AND GOLDSMAN, D. 1992. The TES methodology: Modeling empirical stationary time series. In Proceedings of the 1992 Winter Simulation Conference. Institute of Electrical and Electronics Engineers, Piscataway, N.J., 135-144.

MICROSOFT CORPORATION. 1990. User’s Manual for the Windows Software Development Kit. Microsoft Corporation, Redmond, Wash.

NELDER, J. A. AND MEAD, R. 1964. A simplex method for function minimization. Comput. J. 7, 308-313.

OLSSON, D. M. AND NELSON, L. S. 1975. The Nelder-Mead simplex procedure for function minimization. Technometrics 17, 1, 45-51.

PRITSKER, A. A. B., MARTIN, D. L., REUST, J. S., WAGNER, M. A., WILSON, J. R., KUHL, M. E., DAILY, O. P., HARPER, A. M., EDWARDS, E. B., BENNETT, L. E., ALLEN, M. D., ROBERTS, J. P., AND BURDICK, J. F. 1995. Organ transplantation policy evaluation. In Proceedings of the 1995 Winter Simulation Conference. Institute of Electrical and Electronics Engineers, Piscataway, N.J. (to appear).

STANFIELD, P. M. AND WILSON, J. R. 1993. Multivariate input modeling with Johnson distributions. Tech. Rep. 93-11, Dept. of Industrial Engineering, North Carolina State Univ., Raleigh.

SWAIN, J. J., VENKATRAMAN, S., AND WILSON, J. R. 1988. Least-squares estimation of distribution functions in Johnson’s translation system. J. Stat. Comput. Simul. 29, 271-297.

WAGNER, M. A. F. AND WILSON, J. R. 1995. Using univariate Bézier distributions to model simulation input processes. IIE Transactions (to appear).

WAGNER, M. A. F. AND WILSON, J. R. 1994. Using bivariate Bézier distributions to model simulation input processes. In Proceedings of the 1994 Winter Simulation Conference (Orlando, FL, Dec. 11-14). Institute of Electrical and Electronics Engineers, Piscataway, N.J., 324-331.

WAGNER, M. A. F. AND WILSON, J. R. 1993. Using univariate Bézier distributions to model simulation input processes. In Proceedings of the 1993 Winter Simulation Conference. Institute of Electrical and Electronics Engineers, Piscataway, N.J., 365-373.

Received August 1994; revised August 1995; accepted August 1995
