
Dominated rejection algorithms for generating random variates


Focus Article

Timothy Hall∗

This focus article presents a practical modified version of the von Neumann dominated rejection method for generating univariate random variates using a distribution density function, a uniform variate over a finite interval, and an independent uniform variate over the unit interval. An example generated variate from the normal distribution family is included for demonstration purposes. © 2012 Wiley Periodicals, Inc.

How to cite this article: WIREs Comput Stat 2012. doi: 10.1002/wics.1230

Keywords: dominated rejection algorithm; generated variate data; embedded systems code

INTRODUCTION

The purpose of this focus article is to document the theory of a dominated rejection algorithm (DRA) that may be used to generate arbitrary random variates according to a density function alone. These methods were inspired by a version first proposed by von Neumann¹ (who referred to the algorithm as 'rejection sampling'), as related by Knuth in Ref 2. Additional conditions on the density function f described in the following theorem are imposed in the Policies section; they are needed to implement the DRA in a low-level programming language, such as MMIX, for use in embedded systems.

MMIX is the successor assembly-level programming language to MIX, both of which were invented by Knuth in support of his The Art of Computer Programming (TAOCP) series and related publications. MMIX programming is documented in Professor Knuth's definitive reference work,³ and in Fascicle 1 of the updated TAOCP Volume 1. It provides all the functionality necessary to implement the DRA in a low-level programming environment for use in embedded systems.

∗Correspondence to: [email protected]

PQI Consulting, P. O. Box 425616, Cambridge, MA, USA

MAIN THEOREM

von Neumann proposed an equivalent version of the following theorem as a means to access all random variate distributions (both discrete and continuous) through the strategic use of a uniform distribution. As long as discrete or continuous uniform variate values may be efficiently generated, those values may be used to generate variate values for any distribution (under suitable regularity conditions; see the Policies section). For a proof of the theorem, see Refs 2 and 4.

Theorem 1 Let $f : \mathbb{R} \to \mathbb{R}^{+}$ and $g : \mathbb{R} \to \mathbb{R}^{+}$ be two random variable density functions such that

$$0 < f(x) \le k\,g(x)$$

for some constant $k > 0$, for all $x \in \mathbb{R}$, and let $U$ be a uniformly distributed random variate on $[0, 1]$. Generate random variate $X = x_0$ according to the distribution given by $g$, and generate random variate $U = u_0$, independently of $X$. If $X = x_0$ is accepted when

$$u_0 < \frac{f(x_0)}{k\,g(x_0)}$$

and rejected otherwise, then the random variate generated by the accepted $X$ values has a distribution given by $f$.
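The accept/reject rule of Theorem 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the target f is taken to be the standard normal density, the dominating density g is taken to be the standard Cauchy (a choice made here, not in the article), and the constant k = sqrt(2π/e) is the maximum of f/g, attained at x = ±1. The function names are hypothetical.

```python
import math
import random


def normal_pdf(x):
    """Target density f: the standard normal (assumed for illustration)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def cauchy_pdf(x):
    """Dominating density g: the standard Cauchy (assumed for illustration)."""
    return 1.0 / (math.pi * (1.0 + x * x))


# Constant k with f(x) <= k * g(x) for all x; the ratio f/g peaks at x = +/-1.
K_CONST = math.sqrt(2.0 * math.pi / math.e)


def rejection_sample(count, rng):
    """Return `count` accepted values, following the rule in Theorem 1."""
    accepted = []
    while len(accepted) < count:
        # Generate X = x0 from g by inverting the Cauchy distribution function.
        x0 = math.tan(math.pi * (rng.random() - 0.5))
        # Generate U = u0 on [0, 1) as the next draw from the same generator.
        u0 = rng.random()
        # Accept x0 when u0 < f(x0) / (k g(x0)); otherwise reject it.
        if u0 < normal_pdf(x0) / (K_CONST * cauchy_pdf(x0)):
            accepted.append(x0)
    return accepted


sample = rejection_sample(20000, random.Random(1))
mean = sum(sample) / len(sample)
variance = sum((x - mean) ** 2 for x in sample) / len(sample)
print(f"mean = {mean:.3f}, variance = {variance:.3f}")
```

If the rule is working, the printed sample mean and variance should be close to the standard normal values of 0 and 1.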

A DOMINATED REJECTION ALGORITHM

A DRA applies Theorem 1 to the special case where g is also the density of a uniform random variable, defined on a sufficiently large connected, compact subset V of the support of f. Those values of X that are necessarily excluded by being outside of V are considered so unlikely to occur in the observed values under f that their exclusion does not compromise the application in which the algorithm is implemented. The values of X under this form of g are generated independently of U but also in the same manner as U.

The compact subset V should include those values of f that are most likely to occur, which are (usually) the values most likely of interest in any implementation.

Background

Let T be a random variable defined on $\mathbb{R}$, with strictly positive mass or density function $f : \mathbb{R} \to \mathbb{R}^{+}$, and suppose there is an interval of interest $V = [-a, b]$, where $a + b > 0$. In the application of Theorem 1, let $g : \mathbb{R} \to \mathbb{R}^{+}$ be the (discrete or continuous) uniform distribution on V, so that

$$g(x) = \frac{1}{a + b}, \quad -a \le x \le b.$$

Furthermore, let

$$M = \max\{f(x) : x \in V\} > 0$$

which exists as a real number since f is strictly positive and either discrete or continuous on $\mathbb{R}$. Then

$$0 < f(x) \le M = \frac{a + b}{a + b}\,M = (a + b)\,M\,g(x) = k\,g(x)$$

where $k = (a + b)M > 0$.

The Algorithm

By Theorem 1, the following steps generate a variate with distribution given by f, which is therefore a random variate for T.

1. Generate a uniform variate value x0 on V.

2. Generate a uniform variate value u0 on [0, 1] independently of Step 1.

3. If u0M < f (x0) then accept x0 as a variate value for T; otherwise, reject the value of x0.

4. Repeat from Step 1 until complete.
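Steps 1 to 4 can be sketched as a short Python routine. This is a minimal illustration, not the prototype MMIX implementation: the function name `dra_sample`, the Laplace density used as the target, and the asymmetric interval V = [−5, 8] are all assumptions chosen for the example, and both uniform values are drawn from one seeded generator for brevity.

```python
import math
import random


def dra_sample(f, a, b, M, count, rng):
    """Generate `count` accepted values on V = [-a, b] using Steps 1-4.

    Returns the accepted values and the number of passes that were needed.
    """
    accepted = []
    passes = 0
    while len(accepted) < count:      # Step 4: repeat until complete
        passes += 1
        x0 = rng.uniform(-a, b)       # Step 1: uniform variate value on V
        u0 = rng.random()             # Step 2: uniform variate value on [0, 1)
        if u0 * M < f(x0):            # Step 3: accept x0, otherwise reject it
            accepted.append(x0)
    return accepted, passes


def laplace_pdf(x):
    """Hypothetical target density, strictly positive with maximum 1/2 at 0."""
    return 0.5 * math.exp(-abs(x))


values, passes = dra_sample(
    laplace_pdf, a=5.0, b=8.0, M=0.5, count=5000, rng=random.Random(7)
)
print(f"accepted {len(values)} values in {passes} passes")
```

The caller supplies M, since the maximum of f on V is usually known analytically; here M = f(0) = 1/2 because V contains 0.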

Step 3 follows from the fact that, in this application,

$$\frac{f(x_0)}{k\,g(x_0)} = \frac{f(x_0)}{(a + b)\,M\left(\frac{1}{a + b}\right)} = \frac{1}{M}\,f(x_0) \le 1$$

and $M > 0$.

Since a uniform variate value $t_0$ on V may be generated by a uniform variate value $w_0$ on $[0, 1]$ by the transformation

$$t_0 = (a + b)\,w_0 - a$$

then Steps 1 and 2 may be accomplished using the same uniform variate generator with different initialization values to ensure independence.

Note that no value outside of the interval of interest may be generated or accepted in this manner, since 0 ≤ w0 ≤ 1 means −a ≤ t0 ≤ b.
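The transformation and the use of two differently initialized generators can be sketched as follows. The helper name `uniform_on_V`, the seeds, and the interval are hypothetical; distinct seeds are used here only as a stand-in for the 'different initialization values' mentioned above, and are not by themselves a proof of independence.

```python
import random


def uniform_on_V(w0, a, b):
    """Map a uniform value w0 on [0, 1] to a uniform value t0 on V = [-a, b]."""
    return (a + b) * w0 - a


a, b = 2.0, 3.0
rng_x = random.Random(101)   # generator for Step 1 (assumed seed)
rng_u = random.Random(202)   # generator for Step 2 (assumed seed)

# Each pair is (x0 on V, u0 on the unit interval), drawn from separate streams.
pairs = [(uniform_on_V(rng_x.random(), a, b), rng_u.random()) for _ in range(5)]
for x0, u0 in pairs:
    print(f"x0 = {x0:+.4f}, u0 = {u0:.4f}")
```

The endpoints map as expected: `uniform_on_V(0.0, a, b)` gives −a and `uniform_on_V(1.0, a, b)` gives b.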

ITERATION COUNTS

Given the uniform variate value $x_0$ in the DRA, the probability that the independently chosen uniform variate value $u_0$ is less than $\frac{f(x_0)}{M}$ is itself equal to $\frac{f(x_0)}{M}$. Since the probability that the value $x_0$ is chosen is $\frac{1}{a+b}$, the probability that the joint variate $(x_0, u_0)$ generates an accepted value for $x_0$ is $\frac{f(x_0)}{(a+b)M}$. The joint variate may be viewed as a binomial process whose 'probability of success' is $\frac{f(x_0)}{(a+b)M}$.

The expected value of choosing $x_0$ is

$$\int_{-a}^{b} \frac{1}{a + b}\,x\,dx = \frac{1}{2(a + b)}\left(b^2 - a^2\right) = \frac{1}{2}(b - a)$$

so that the 'expected value' of the 'probability of success' is

$$\frac{f\left(\frac{1}{2}(b - a)\right)}{(a + b)M}$$

which, in turn, means the expected number of accepted values for $x_0$ in N passes through the algorithm is

$$\lambda = \frac{f\left(\frac{1}{2}(b - a)\right)}{(a + b)M}\,N.$$

The choices of a and b that maximize $f\left(\frac{1}{2}(b - a)\right)$ while minimizing $a + b$ drive the maximum value of λ for a given N.
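As a cross-check on this estimate, the per-pass acceptance probability can also be obtained by averaging the conditional acceptance probability f(x0)/M over the uniform choice of x0 on V. The derivation below is an added worked step rather than part of the original argument; it uses only the definitions of V, M, and f given above.

$$\Pr(\text{accept}) = \int_{-a}^{b} \frac{1}{a + b}\cdot\frac{f(x)}{M}\,dx = \frac{\Pr(-a \le T \le b)}{(a + b)M}$$

$$E[\text{number of accepted values in } N \text{ passes}] = \frac{\Pr(-a \le T \le b)}{(a + b)M}\,N$$

Compared with this expression, λ uses the density value at the midpoint of V in place of the probability mass of V. The two can therefore differ noticeably when V covers most of the mass of f, so λ is perhaps best read as an approximation to be checked against a trial run.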

POLICIES

The DRA may be applied to any random variable X on R under the following analytical conditions (which facilitate implementation in a low-level programming language).

1. The distribution of X must have a well-defined, finite, piecewise discrete or continuous density function f defined on its support.

2. The function f must have a finite second central moment (called the variance $\sigma_X^2$), which means it also has a finite first moment (called the mean $\mu_X$).

3. All intervals of interest are based only on $\mu_X$ and $\sigma_X$.

4. The analytical methods used to generate uniform variate values conform to the predefined policies, specifications, and requirements that are independently established for the DRA.

5. The required number of accepted values X = x0 is determined and fixed before the algorithm is applied.

6. Exceptions to these policies are allowed as required by the particular circumstances of the implementation.

FIGURE 1 | Standard normal generated variate values. [Histogram titled 'Standard normal variate'; horizontal axis: Value, from −6.00 to 5.76; vertical axis: Frequency, from 0 to 1400.]

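EXAMPLE: THE STANDARD NORMAL RANDOM VARIATE

Consistent with the prototype MMIX implementation code, consider the standard normal random variable X with density function

$$f(x) = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}x^2}.$$

We have

$$\mu_X = 0 < \infty \quad \text{and} \quad \sigma_X = 1 < \infty$$

and

$$\frac{d}{dx} f(x) = \frac{1}{\sqrt{2\pi}}\,(-x)\,e^{-\frac{1}{2}x^2} = 0$$

means that x = 0 is the only critical value. Then

$$\left.\frac{d^2}{dx^2} f(x)\right|_{x=0} = \left.\left(\frac{1}{\sqrt{2\pi}}\,x^2 e^{-\frac{1}{2}x^2} - \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}x^2}\right)\right|_{x=0} = -\frac{1}{\sqrt{2\pi}} < 0$$

means x = 0 is a global maximum for f on $\mathbb{R}$. This means $M = f(0) = \frac{1}{\sqrt{2\pi}}$.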

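Therefore,

$$\frac{f\left(\frac{1}{2}(b - a)\right)}{a + b} = \frac{1}{\sqrt{2\pi}\,(a + b)}\,e^{-\frac{1}{8}(b - a)^2}$$

so that

$$\frac{\partial}{\partial a}\,\frac{f\left(\frac{1}{2}(b - a)\right)}{a + b} = \frac{1}{4}(b - a)\,\frac{1}{\sqrt{2\pi}\,(a + b)}\,e^{-\frac{1}{8}(b - a)^2} - \frac{1}{\sqrt{2\pi}\,(a + b)^2}\,e^{-\frac{1}{8}(b - a)^2} = 0$$

$$\frac{\partial}{\partial b}\,\frac{f\left(\frac{1}{2}(b - a)\right)}{a + b} = -\frac{1}{4}(b - a)\,\frac{1}{\sqrt{2\pi}\,(a + b)}\,e^{-\frac{1}{8}(b - a)^2} - \frac{1}{\sqrt{2\pi}\,(a + b)^2}\,e^{-\frac{1}{8}(b - a)^2} = 0$$

which means (subtracting the second equation from the first)

$$\frac{1}{2}(b - a)\,\frac{1}{\sqrt{2\pi}\,(a + b)}\,e^{-\frac{1}{8}(b - a)^2} = 0$$

or

$$a = b.$$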

Hence, all intervals of interest V should be symmetric about the mean 0, i.e., of the form $V = [-b, b]$, and this gives

$$\lambda = \frac{f\left(\frac{1}{2}(b - a)\right)}{(a + b)M}\,N = \left(\frac{f(0)}{M}\right)\frac{N}{2b} = \frac{N}{2b}.$$

The smaller b > 0 becomes, the larger λ becomes, and the larger b becomes, the smaller λ becomes. This trade-off then determines the value of b (and a). If K accepted values are needed, the value of b should be chosen so that the interval [−b, b] minimally covers all values of interest and that 2bK is not unacceptably high.

Figure 1 depicts a histogram of K = 300,000 accepted values for b = 6. In this case, approximately N = 2bK = 2(6)(300,000) = 3,600,000 passes through the algorithm were required. Even though this may appear to be an excessive number of iterations, such calculations may be effected in a low-level programming language environment and consume only a few processing cycles per iteration.
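A small simulation can be used to compare an observed pass count with the 2bK estimate. The sketch below is written in Python rather than MMIX, uses a reduced K for speed, and its function names and seed are hypothetical. Because 2bK rests on the midpoint approximation from the Iteration Counts section, the observed count need not match it: averaging f over [−6, 6] gives an acceptance probability of about √(2π)/12 ≈ 0.209, which corresponds to roughly 4.8 passes per accepted value.

```python
import math
import random

M = 1.0 / math.sqrt(2.0 * math.pi)   # f(0), the maximum of the density


def normal_pdf(x):
    """Standard normal density f."""
    return M * math.exp(-0.5 * x * x)


def count_passes(K, b, rng):
    """Run the DRA on V = [-b, b] until K values are accepted; return the passes."""
    accepted = 0
    passes = 0
    while accepted < K:
        passes += 1
        x0 = 2.0 * b * rng.random() - b   # t0 = (a + b) w0 - a with a = b
        u0 = rng.random()
        if u0 * M < normal_pdf(x0):
            accepted += 1
    return passes


K, b = 30000, 6.0                         # reduced K, assumed for a quick run
passes = count_passes(K, b, random.Random(2012))
print(f"observed passes per accepted value: {passes / K:.2f}")
print(f"estimate 2b from the text:          {2 * b:.2f}")
```

Running the same comparison for several values of b is one way to choose an interval that covers the values of interest without an unacceptable pass count.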

CONCLUSION

This focus article presents the theory, calculation methods, and implementation details for a DRA that provides for the calculation of an arbitrary discrete or continuous distribution random variate through the use of a discrete or continuous uniform random variate. Iteration count estimates were provided, and policies were proposed for implementing the algorithm in a low-level programming language environment for use in embedded systems. See Ref 4 for further details on the algorithm and its implementation.

REFERENCES

1. von Neumann J. Various techniques used in connection with random digits. Applied Mathematics Series: Monte Carlo Methods, vol. 12. Washington, DC: National Bureau of Standards; 1951:36–38.

2. Knuth DE. The Art of Computer Programming, Seminumerical Algorithms, vol. 2, 3rd ed. Reading, MA: Addison-Wesley; 1998.

3. Knuth DE. MMIXware: A RISC Computer for the Third Millennium. Heidelberg, Germany: Springer-Verlag; 1999.

4. Hall T. A dominated rejection algorithm for generating random variates. Proceedings of the 2011 Joint Statistical Meetings, Miami Beach, FL, 2011.
