Random Variate Generation
CS1538: Introduction to Simulations
Random Variate Generation
Once we have obtained/created and verified a quality random number generator for U[0,1), we can use it to obtain random values from other distributions
Ex: Exponential, Normal, etc.
There are several techniques for generating random variates
Some are more efficient than others
Some are easier to implement than others
Methods for Generating Random Variates
Method 1: Inverse Transform
Method 2: Accept/Reject
Method 3: Special Properties
Method #1: Inverse Transform Technique
Applicable to distributions with a closed-form CDF
There is some function F⁻¹ that maps values from U[0,1) into the desired distribution
Inverse Transform
Suppose we want to sample from some continuous distribution with CDF F(x) = Pr(X ≤ x).
We want to know the inverse of F(x), F⁻¹(x).
That is, F(F⁻¹(x)) = F⁻¹(F(x)) = x
Example: x² and sqrt(x) are inverses of each other
Because we know that 0 ≤ F(x) ≤ 1, the outcome of F(x) can be represented by a draw from U[0,1); call that R.
If we know F⁻¹(x), we can get a sample from the desired distribution by calculating F⁻¹(R).
Inverse Transform
General strategy for finding F⁻¹(x):
Begin with: y = F(x)
Go through some algebra to write the equation in terms of x = G(y)
G(y) is the F⁻¹(x) we're looking for.
Example: Exponential Distribution
CDF for Exponential Distribution:
F(x) = 1 - e^(-λx) for x ≥ 0; F(x) = 0 for x < 0
We want to find F⁻¹(x) such that F⁻¹(F(x)) = x
F⁻¹(x) = (-1/λ) ln(1 - x)
Example: Exponential Distribution
CDF for Exponential Distribution:
F(x) = 1 - e^(-λx) for x ≥ 0; F(x) = 0 for x < 0
We want to find F⁻¹(x) such that F⁻¹(F(x)) = x
F⁻¹(x) = (-1/λ) ln(1 - x)
Implement in code:
def rand_expon(lam):
    R = rand()
    return -ln(1 - R) / lam
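A runnable sketch of this inverse-transform sampler in Python, assuming only the standard library (`random.random()` plays the role of `rand()`):

```python
import math
import random

def rand_expon(lam):
    """Inverse-transform sample from Exponential(rate=lam): F^-1(R) = -(1/lam) ln(1-R)."""
    R = random.random()               # R ~ U[0,1)
    return -math.log(1.0 - R) / lam   # R < 1, so 1-R is in (0,1] and ln is safe
```

Because R is drawn from [0,1), the argument 1 - R never hits 0, so the logarithm is always defined.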
Example: Continuous Uniform Distributions on a Different Range
Suppose we want to draw from U[a,b)
Intuitively, we know how to get the f(x) and F(x) of this distribution from the standard uniform one:
Scale the range from 0,1 to match a,b
Shift to get a different starting point
If R ~ U[0,1), we'd get our sample with X = a + (b-a)R
How do you use the Inverse Transform Method?
f(x) = 1/(b-a), when a ≤ x ≤ b
F(x) = (x-a)/(b-a), when a ≤ x ≤ b
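Solving y = (x-a)/(b-a) for x gives F⁻¹(y) = a + (b-a)y, which matches the intuitive scale-and-shift answer. A minimal sketch:

```python
import random

def rand_uniform(a, b):
    """Inverse transform for U[a,b): F^-1(R) = a + (b-a)*R."""
    R = random.random()   # R ~ U[0,1)
    return a + (b - a) * R
```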
Example:
Empirical Continuous Distribution
Suppose we observe n numbers, and we know that they came from a continuous distribution
E.g.: observed 5 response times to incoming alarms
Sort the numbers in increasing order: x(1)…x(n)
Let x(0) be the known minimum of f(x)
Define n (unequal) intervals: (x(i-1), x(i)) for i = 1..n
But the chance of landing in each interval is even: 1/n
We can plot the CDF of the empirical distribution as the piecewise collection of line segments that connect the coordinates (x(i), i/n) for i = 0..n.
Example: Empirical Continuous Distribution (continued)
For the ith segment (i = 1..n):
F(x) = m_i·(x - x(i-1)) + (i-1)/n,
where m_i = (1/n) / (x(i) - x(i-1))
Let a_i = 1/m_i be the inverse of the slope
F⁻¹(x) = x(i-1) + a_i·(x - (i-1)/n),
for (i-1)/n ≤ x ≤ i/n.
So the procedure is:
Draw R from U[0,1).
Figure out the interval that R belongs to:
F(x(i-1)) ≤ R ≤ F(x(i))
return F⁻¹(R)
[Figure: piecewise-linear empirical CDF F(x) connecting (x(0), 0), (x(1), 0.25), (x(2), 0.5), (x(3), 0.75), (x(4), 1.0)]
Example:
Empirical Continuous Distribution
Example: Five observations of fire-crew response times (in
minutes) to incoming alarms have been collected
2.76, 1.83, 0.80, 1.45, 1.24
Use these observations to form an empirical continuous
distribution. Use the empirical distribution to generate a
random variate from it.
Let R = 0.71 be generated from U[0, 1)
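The procedure above can be sketched in Python; the worked draw R = 0.71 lands in the 4th interval (0.6 ≤ R < 0.8):

```python
def rand_empirical(obs, R, x0=0.0):
    """Inverse-transform draw from an empirical continuous CDF.
    obs: the n observed values; x0: the known minimum x(0); R: a draw from U[0,1)."""
    xs = [x0] + sorted(obs)
    n = len(obs)
    i = int(R * n) + 1                  # interval i satisfies (i-1)/n <= R < i/n
    a_i = (xs[i] - xs[i - 1]) * n       # a_i = 1/m_i
    return xs[i - 1] + a_i * (R - (i - 1) / n)

# Slide's example: sorted x = 0.80, 1.24, 1.45, 1.83, 2.76 with x(0) = 0
x = rand_empirical([2.76, 1.83, 0.80, 1.45, 1.24], 0.71)
# x = 1.45 + 1.9*(0.71 - 0.6) ≈ 1.659
```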
Methods for Generating Random Variates
Method 1: Inverse Transform
Method 2: Accept/Reject
Method 3: Special Properties
Method #2: Accept/Reject (Idea)
Suppose we want to sample from some continuous distribution f(x) that is defined on the interval (a,b), but we can't easily compute the inverse CDF, F⁻¹(x).
We can sample uniformly from within the bounding rectangle of f(x). We accept if the point falls under f(x) and reject otherwise.
Image from: Peter Haas, MS&E223, Stanford
Accept/Reject (Simple version)
Let m be the maximum value that f(x) can take
Step 1: Get two (independent) draws from U[0,1): U1, U2
Step 2: we need to transform the coordinates (U1, U2) from the unit square to the bounding box:
R1 = a + (b-a)·U1
R2 = m·U2
Step 3:
If R2 ≤ f(R1): return R1   // success, so accept
Otherwise, go back to Step 1   // reject
Example
Take:
f(x) = x/3, for 0 ≤ x ≤ 2
f(x) = 2(3-x)/3, for 2 < x ≤ 3
f(x) = 0, otherwise
Try (U1, U2) = (0.3, 0.8)
Try (U1, U2) = (0.7, 0.6)
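A runnable sketch of the simple accept/reject scheme for this f(x), taking a = 0, b = 3 and m = max f = f(2) = 2/3:

```python
import random

def f(x):
    """Triangular pdf from the slide: x/3 on [0,2], 2(3-x)/3 on (2,3]."""
    if 0 <= x <= 2:
        return x / 3
    if 2 < x <= 3:
        return 2 * (3 - x) / 3
    return 0.0

def try_once(U1, U2, a=0.0, b=3.0, m=2/3):
    """One trial: map (U1, U2) into the bounding box, then test the accept rule."""
    R1 = a + (b - a) * U1
    R2 = m * U2
    return R1 if R2 <= f(R1) else None   # None means reject

def accept_reject():
    while True:
        X = try_once(random.random(), random.random())
        if X is not None:
            return X
```

The two slide draws work out as: try_once(0.3, 0.8) rejects (R2 ≈ 0.533 > f(0.9) = 0.3), while try_once(0.7, 0.6) accepts and returns R1 = 2.1 (R2 = 0.4 ≤ f(2.1) = 0.6).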
Accept/Reject
Advantage: Doesn't require figuring out the inverse CDF
Disadvantage: may have to sample a lot to get an accept
How many samples do we expect it to take? This can be described as a geometric distribution
Why? Each trial is an independent Bernoulli experiment that accepts with probability p = (area under f)/(area of the box) = 1/(m(b-a))
How? The number of trials until the first accept is Geometric(p), with expected value 1/p = m(b-a)
Generalized Accept/Reject (Idea)
We don’t actually have to use the uniform distribution
along the x-axis
More generally, we can approximate f(x) with a
distribution that is similar to it, say g(x), if we can satisfy
some requirements
Generalized Accept/Reject (Requirement)
We know the density of g(x)
We can easily sample from g(x)
There is some constant c such that f(x) ≤ c*g(x) for all x
Generalized Accept/Reject Algorithm
Figure out c: find the max of f(x)/g(x) over all x
Step 1: X ← draw a sample from g(x)
Step 2: R ← draw a sample from U[0,1)
Step 3: If R·c·g(X) ≤ f(X), then return X
Otherwise go back to Step 1
On average, how many times before an accept?
Like before, we can use the Geometric distribution for our analysis, using the chance of accept (1/c) as the chance of success
So we see that the closer f(x) is to g(x), the smaller the number of trials (# rejects → 0 as c → 1)
Example
Suppose the distribution we want has the following pdf:
f(x) = K·x^(1/2)·e^(-x), where K = 2/sqrt(π), x > 0
f(x) is the pdf of a Gamma distribution with parameters (3/2, 1)
We choose g(x) to be:
What distribution have we already seen that looks similar to f(x)? The exponential: g(x) = (2/3)·e^(-2x/3)
Now, figure out c
Find the max of: f(x)/g(x)
Take the derivative and set to 0, then solve for x. (Answer: x = 3/2)
So c = f(3/2)/g(3/2) = 3^(3/2)/(2πe)^(1/2) ≈ 1.257
Example continued
Once we know c, we can define f(x)/(c·g(x))
f(x)/(c·g(x)) = (2e/3)^(1/2)·x^(1/2)·e^(-x/3)
def gen_fx:
    repeat until accept:
        X ← gen_exponential(2/3)   // draw from g(x)
        R ← U[0,1)                 // draw from uniform
        if R ≤ (2e/3)^(1/2)·X^(1/2)·e^(-X/3):   // i.e., R·c·g(X) ≤ f(X)
            return X               // accept
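The pseudocode above can be run directly in Python; `random.expovariate(2/3)` draws from an exponential with rate 2/3, playing the role of gen_exponential:

```python
import math
import random

def gen_fx():
    """Generalized accept/reject for f(x) = (2/sqrt(pi)) x^(1/2) e^(-x),
    i.e. Gamma(3/2, 1), using g(x) = (2/3) e^(-2x/3) and c = 3^(3/2)/sqrt(2*pi*e)."""
    while True:
        X = random.expovariate(2 / 3)    # draw from g(x), rate 2/3
        R = random.random()              # draw from U[0,1)
        # accept with probability f(X)/(c*g(X)) = (2e/3)^(1/2) X^(1/2) e^(-X/3)
        if R <= math.sqrt(2 * math.e / 3) * math.sqrt(X) * math.exp(-X / 3):
            return X
```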
Methods for Generating Random Variates
Method 1: Inverse Transform
Method 2: Accept/Reject
Method 3: Special Properties
Method 3: Make Use of Special Properties of
the Distribution
Composite method
Example: Binomial is a sum of Bernoulli trials
Example: Erlang is a sum of Exponentials
Polar co-ordinates method
Normal Distribution
Exploit Relationships
Poisson has a relationship with Exponential
Composite Method: Erlang Distribution
An Erlang distribution with parameters k, θ can be thought of as the sum of k independent exponential random variables, each with the same rate λ = kθ
Since we can generate the exponential distribution using the inverse transform technique, we should be able to easily generate an Erlang distribution as well
def rand_erlang(k, theta):
    lam = k*theta
    tot = 0
    for i = 1..k:
        tot += rand_expdistr(lam)
    return tot
Efficiency: Erlang Distribution
If we rely on drawing from the exponential k times:
Σ_{i=1}^{k} rand_exp(λ) = Σ_{i=1}^{k} (-(1/λ)·ln R_i)
This requires performing ln(x), an expensive operation, k times
To get around this, we can exploit this logarithmic property:
log(ab) = log(a) + log(b)
So the above can be rewritten as:
-(1/λ)·Σ_{i=1}^{k} ln R_i = -(1/λ)·ln(Π_{i=1}^{k} R_i)
This requires only one log operation (but with k multiplications as opposed to k additions).
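A sketch of the single-log version in Python, using the slide's convention that each summand has rate λ = kθ (1 - R is used so the product never contains 0):

```python
import math
import random

def rand_erlang(k, theta):
    """Erlang(k, theta) via one log of a product of k uniforms."""
    lam = k * theta                      # rate of each exponential summand
    prod = 1.0
    for _ in range(k):
        prod *= 1.0 - random.random()    # 1-R ~ U(0,1], so ln(prod) is defined
    return -math.log(prod) / lam
```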
Composite Method: Binomial Distribution
Given a random number generator for the Bernoulli
distribution, how might we generate a random number
generator for the Binomial distribution?
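One natural answer, sketched in Python: a Binomial(n, p) variate is the sum of n independent Bernoulli(p) trials.

```python
import random

def rand_bernoulli(p):
    """Bernoulli(p): 1 with probability p, else 0."""
    return 1 if random.random() < p else 0

def rand_binomial(n, p):
    """Composite method: Binomial(n, p) as a sum of n Bernoulli(p) trials."""
    return sum(rand_bernoulli(p) for _ in range(n))
```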
Normal
Good news: If we can sample from N(0,1), we can transform the outcome for an arbitrary normal.
N(0,1) has pdf f(x) = (1/sqrt(2π))·exp(-x²/2)
Bad news: Its CDF doesn't have a convenient closed form
Can't do Inverse Transform (unless we approximate)
Can do generalized accept/reject
Let g(x) be an exponential distribution with λ = 1
Then c = sqrt(2e/π)
Not as efficient
Preferred method: Polar Coordinate Method
Sampling from N(0,1)
Suppose we draw two samples from N(0,1). Let Z1 be the random variable for the first draw and Z2 be the random variable for the second draw. On the (Z1, Z2) coordinate system, we have
Z1 = H·cos(θ)
Z2 = H·sin(θ)
H² = Z1² + Z2² has a χ² distribution with 2 degrees of freedom
which is equivalent to an exponential distribution with mean 2 (λ = 1/2).
If we can generate H and θ easily, we can use them to make Z1 and Z2
Sampling from N(0,1), Naïve Method
We know how to sample for H since H² is an exponential
H = (-2·ln(U1))^(1/2), where U1 ~ U[0,1)
To sample θ, we pick uniformly from [0, 2π)
So θ = 2π·U2, where U2 ~ U[0,1)
Then, we'd get
Z1 = (-2·ln(U1))^(1/2)·cos(2π·U2)
Z2 = (-2·ln(U1))^(1/2)·sin(2π·U2)
This method gives us two random draws. If you only need one, just toss the other one.
Disadvantage: sin, cos are expensive operations
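A runnable sketch of the naïve method (1 - U1 is used inside the log so its argument is never 0):

```python
import math
import random

def naive_normal_pair():
    """Naive polar method: Z1 = H cos(theta), Z2 = H sin(theta),
    with H = sqrt(-2 ln U1) and theta = 2*pi*U2."""
    U1 = random.random()
    U2 = random.random()
    H = math.sqrt(-2.0 * math.log(1.0 - U1))   # 1-U1 in (0,1], so log is safe
    theta = 2.0 * math.pi * U2
    return H * math.cos(theta), H * math.sin(theta)
```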
Sampling from N(0,1), Improved Method
Idea: rather than using one uniform random number to come up with the angle θ, use two uniform random numbers and trig properties:
V1 ~ U[-1, 1)
V2 ~ U[-1, 1)
Let S² = V1² + V2²
Then cos(θ) = V1/S and sin(θ) = V2/S
Z1 = (-2·ln(U1))^(1/2)·cos(2π·U2) = (-2·ln(U1))^(1/2)·V1/S
Z2 = (-2·ln(U1))^(1/2)·sin(2π·U2) = (-2·ln(U1))^(1/2)·V2/S, where U1 ~ U[0,1)
Any inefficiencies with this?
Sampling from N(0,1), Efficient Method
We can actually do away with the first uniform draw.
Suppose we reject drawings of V1 and V2 if S² ≥ 1.
Then (V1, V2) is uniformly distributed in a circle with radius 1.
So S² = V1² + V2² is uniformly distributed between [0, 1).
Let U = S², then
Z1 = (-2·ln(U))^(1/2)·V1/U^(1/2) = V1·(-2·ln(U)/U)^(1/2)
Z2 = (-2·ln(U))^(1/2)·V2/U^(1/2) = V2·(-2·ln(U)/U)^(1/2)
Sampling from N(0, 1)
def box_muller:
    repeat until accept:
        V1 ← U[-1, 1)
        V2 ← U[-1, 1)
        U = V1² + V2²
        if U < 1: accept
    Z1 = V1·(-2·ln(U)/U)^(1/2)
    Z2 = V2·(-2·ln(U)/U)^(1/2)
    return (Z1, Z2)
Named after the authors of the paper describing this method [Box & Muller, 58]
There are other, more efficient techniques, but this one is simple to compute
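A runnable Python version of this pseudocode (a sketch; the 0 < U check is an added guard against the measure-zero case V1 = V2 = 0, where ln(U) would be undefined):

```python
import math
import random

def box_muller():
    """Polar accept/reject variant of Box-Muller: no sin/cos needed."""
    while True:
        V1 = random.uniform(-1.0, 1.0)
        V2 = random.uniform(-1.0, 1.0)
        U = V1 * V1 + V2 * V2
        if 0.0 < U < 1.0:              # reject points outside the unit circle
            break
    factor = math.sqrt(-2.0 * math.log(U) / U)
    return V1 * factor, V2 * factor    # two independent N(0,1) draws
```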
Sampling from N(µ, σ²)
How do you transform from N(0, 1) to N(µ, σ²)?
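The standard answer, as a one-line sketch: scale a standard normal draw by σ and shift by µ.

```python
def normal_from_standard(z, mu, sigma):
    """If Z ~ N(0,1), then mu + sigma*Z ~ N(mu, sigma^2)."""
    return mu + sigma * z
```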
Generating a (homogeneous) Poisson Process
Suppose we want to simulate customers arriving at a
take-out stand with the Poisson Process. Let the rate of
arrival (λ) be 5 customers per hour. The stand opens at
10am and closes at 3pm.
How do we write a function that generates customer
arrival time?
Generating a (homogeneous) Poisson Process
def gen_poisson_proc1(λ, T):
    t ← 0
    q ← new queue
    while t < T:
        t += gen_exponential(λ)
        if t < T: q.enqueue(t)
    return q
def gen_poisson_proc2(λ, T):
    q ← new queue
    N ← gen_poisson_dist(λ, T)   // N ~ Poisson(λT)
    for i from 1 to N:
        q.enqueue(U[0,1)·T)
    sort q
    return q
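A runnable sketch of the first version (cumulative exponential inter-arrival gaps), using `random.expovariate` as the exponential generator:

```python
import random

def gen_poisson_proc(lam, T):
    """Homogeneous Poisson process on [0, T): accumulate Exp(lam) gaps."""
    t, arrivals = 0.0, []
    while True:
        t += random.expovariate(lam)   # inter-arrival time ~ Exp(rate lam)
        if t >= T:
            return arrivals
        arrivals.append(t)
```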
How to generate a Poisson Random Variate?
Inverse Transform method?
Pr(X ≤ x) = Σ_{i=0}^{x} e^(-α)·αⁱ/i!
What's the inverse of that?
Use Accept/Reject Algorithm
Idea: Generate exponential inter-arrival times (Ai’s) until arrival
n+1 occurs after desired time interval (T)
A1 + A2 + … + An ≤ T < A1 + A2 + … + An + An+1
How to generate a Poisson Random Variate?
A1 + A2 + … + An ≤ T < A1 + A2 + … + An + An+1
Take T = 1 and ln(R) instead of ln(1-R) for ease of calculation
A_i = -(1/α)·ln R_i
Σ_{i=1}^{n} -(1/α)·ln R_i ≤ 1 < Σ_{i=1}^{n+1} -(1/α)·ln R_i
Multiply through by -α (which flips the inequalities):
Σ_{i=1}^{n} ln R_i ≥ -α > Σ_{i=1}^{n+1} ln R_i
Sum of logs = log of products:
ln(Π_{i=1}^{n} R_i) ≥ -α > ln(Π_{i=1}^{n+1} R_i)
Since exp(ln(x)) = x, exponentiate both sides:
Π_{i=1}^{n} R_i ≥ e^(-α) > Π_{i=1}^{n+1} R_i
How to generate a Poisson Random Variate?
A1 + A2 + … + An ≤ T < A1 + A2 + … + An + An+1
After some transformations and simplifying assumptions:
Π_{i=1}^{n} R_i ≥ e^(-α) > Π_{i=1}^{n+1} R_i
We need to identify the n that satisfies the inequality
Find the smallest n such that: e^(-α) > Π_{i=1}^{n+1} R_i
Take P_n = Π_{i=1}^{n+1} R_i
Generating Poisson Random Variate
def gen_poisson_dist(α):
    n ← 0
    P ← 1
    while True:
        R_{n+1} ← U[0, 1)
        P ← P·R_{n+1}
        if P < e^(-α):
            return n
        else:
            n ← n + 1
How do you generate Poisson RVs according to parameters λ, T?
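The pseudocode above runs directly in Python; for a Poisson process with rate λ observed over an interval of length T, use α = λT:

```python
import math
import random

def gen_poisson_dist(alpha):
    """Return the smallest n with R1*...*R(n+1) < e^(-alpha); n ~ Poisson(alpha)."""
    n, P = 0, 1.0
    threshold = math.exp(-alpha)
    while True:
        P *= random.random()
        if P < threshold:
            return n
        n += 1
```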
Poisson Random Variate Example
Buses arrive at a bus stop according to a Poisson process with
a mean of one bus per 15 minutes. Generate a random variate
N which represents the number of arriving buses during a 1-
hour time slot.
With λ = 1 bus per 15 minutes and T = 60 minutes, α = λT = 4, so the accept threshold is e^(-4) ≈ 0.0183.
n   R_{n+1}   P        Accept/Reject   Result
0   0.4357    0.4357   reject
1   0.4146    0.1806   reject
2   0.8353    0.1509   reject
3   0.9952    0.1502   reject
4   0.8004    0.1202   reject
5   0.7945    0.0955   reject
6   0.1530    0.0146   accept          N = 6
Poisson Random Variate Example (Comments)
What if a stop has multiple buses stop at it (e.g. Fifth & Bigelow for inbound buses)?
Larger values of α lower the accept threshold e^(-α), generally requiring more random numbers
E[Random numbers generated before accept] = E[N+1] = E[N] + E[1] = α + 1
Generating m random variates takes approx. m·(α + 1) random numbers
Note that P is strictly decreasing
For large α (α ≥ 15), generating RVs this way becomes expensive; we can use an approximation method instead:
Z = (N - α)/sqrt(α) approximately follows N(0, 1) when α is large
Generate a standard normal variate Z
N = ⌈α + sqrt(α)·Z - 0.5⌉
Nonstationary Poisson Process
Arrival rate may not be constant over the time of interest
Customers at a take-out stand may have peak arrivals at noon and far fewer at 10am
Idea 1: Take the average arrival rate over the entire time
Fails to accurately model arrivals: will have fewer arrivals during peak times and more arrivals during lulls
Idea 2: For an arrival at time t, the time of the next arrival is
t + rand_expon(λt)
Nonstationary Poisson Process
Idea 2: For arrival at time t, time of next arrival is at
t + rand_expon(λt)
If arrival occurs at t = 5 (when λt is low), then rand_expon(λt)
could be large (because 1/λt is large)
Might miss first peak around t = 9
[Figure: time-varying arrival rate λt with peaks; image from http://web.ics.purdue.edu/~hwan/IE680/Lectures/Chap08Slides.pdf]
Nonstationary Poisson Process
Idea 3: Generate a Poisson arrival process at the fastest rate, then accept only a portion of the arrivals to get the desired rate. This is the thinning method (an accept/reject method)
def gen_nonstat_poisson_proc(λ(t), T, N):
    t ← 0
    λ* ← max over 0 ≤ t ≤ T of λ(t)   // the max arrival rate
    q ← new queue
    while q.length < N:
        E ← rand_expon(λ*)
        t += E
        R ← U[0, 1)
        if R ≤ λ(t)/λ*:
            q.enqueue(t)
    return q
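A runnable sketch of thinning in Python. One detail differs from the pseudocode above: this version also stops once t passes T (an added guard, since the pseudocode could loop past the horizon if fewer than N arrivals occur in [0, T)):

```python
import random

def gen_nonstat_poisson_proc(rate, T, N, rate_max):
    """Thinning: candidate arrivals at rate_max, each accepted w.p. rate(t)/rate_max."""
    t, q = 0.0, []
    while len(q) < N:
        t += random.expovariate(rate_max)   # candidate gap at the fastest rate
        if t >= T:
            break                           # added guard: stop at the horizon
        if random.random() <= rate(t) / rate_max:
            q.append(t)
    return q
```

For example, with a piecewise rate like the table below, `rate` could be a function mapping t to the arrival rate of its interval, and `rate_max` its maximum over [0, T).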
Nonstationary Poisson Process Example
(Ex 8.10 from Banks et al) Using the table below, generate the
first two arrival times.
t (min)      Mean time between arrivals (min)   Arrival Rate λ(t)
[0, 60)      15                                 1/15
[60, 120)    12                                 1/12
[120, 180)    7                                 1/7
[180, 240)    5                                 1/5
[240, 300)    8                                 1/8
[300, 360)   10                                 1/10
[360, 420)   15                                 1/15
[420, 480)   20                                 1/20
[480, 540)   20                                 1/20
Take E’s from: 13.13, 2.96, 46.05, 28.22, 33.97
Take R’s from: 0.8830, 0.0240, 0.0001, 0.8019