Chair of Information Systems IV (ERIS) Institute for Enterprise Systems (InES)
16 April 2013, 10.15 am – 11.15 am
Martin Kretzer
Phone: +49 621 181 3276
E-Mail: [email protected]
Generating Continuous Random Variables (IS 802 “Simulation”, Section 3)
Agenda
1 Introduction
2 Selected Methods
3 Generating Important Distributions
4 Summary
5 Exercises (Simulations using R)
Why “Generation of Continuous Random Variables”?
Simulations are able to address manifold questions, e.g.:
- How many customers will we have?
- How long does it take to handle arriving customers?
- Will queues develop?
Problem: Simulations require random variables
- Uniformly distributed random variables are often not enough, but they can be used to generate further distributions
- Uniformly distributed random variables can be generated, e.g., with the modulo function (section 2)
- Example: Customers in a store arrive randomly, between 10 and 20 minutes apart, normally / exponentially distributed on this interval.
Goal: Generate random variables
- Discrete random variables (section 2)
- Continuous random variables (univariate) (section 3)
  Each method for generating a discrete random variable has its analogue in the continuous case!
- Multivariate normal distributed variables (section 4)
Discrete Variables
- Finite or countably infinite set of values
- Example: population 0, 1, 2, 3, …
Continuous Variables
- Continuous distribution function
- Example: temperature
- Univariate: one predicting variable
- Multivariate: multiple predicting variables
Learning Outcomes
Generate random variables using the Inverse-Transform and Acceptance-Rejection Method
Develop algorithms for simulating Exponential, Normal, Poisson and Nonhomogeneous Poisson distributions
Perform simulations using R
Agenda
1 Introduction
2 Selected Methods
2.1 Inverse Transform Method
2.2 Acceptance Rejection Method
3 Generating Important Distributions
4 Summary
5 Exercises (Simulations using R)
Inverse Transform Method: Proof
Prerequisites
- The uniform (0,1) random variable $U$
- The continuous cumulative distribution function $F$ of the targeted random variable $X$
Inverse Transform Method
- Definition: $X = F^{-1}(U)$
- Note: The superscript “-1” is not an exponent! It indicates the inverse!
- Transformation: $F^{-1}(u)$ is defined to be that value of $x$ such that $F(x) = u$
Proof
- Let $F_X$ be the distribution function of $X = F^{-1}(U)$
- $F_X(x) = P\{X \le x\} = P\{F^{-1}(U) \le x\} = P\{U \le F(x)\} = F(x)$ *
- * The third equality holds because $F$ is monotone increasing, so $F^{-1}(U) \le x$ is equivalent to $U \le F(x)$; the last equality holds because $U$ is uniform (0,1)
- Hence $X$ has distribution function $F$
- A random variable $X$ can be generated from the continuous cumulative distribution function $F$ by generating a random number $U$ and then setting $X = F^{-1}(U)$
Inverse Transform Method:
1. Find a formula for the function $F^{-1}$
2. Generate a uniform random number $U$
3. Return the random number $X = F^{-1}(U)$
Inverse Transform Method: Example
Example Task: Generate a random variable $X$ with cumulative distribution function $F(x) = x^3$, $0 < x < 1$ (the case $n = 3$ of $F(x) = x^n$)
1. Find a formula for the function $F^{-1}$: setting $u = F(x) = x^3$ gives $x = F^{-1}(u) = u^{1/3}$
2. Formulate the algorithm for generating the random variable: generate a random number $U$ and return $X = U^{1/3}$
Limitation of the Inverse Transform Method
- The cumulative distribution function needs to be invertible in closed form
- The normal distribution function has no closed-form inverse, so the inverse transform method is not suited for the normal distribution!
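The example above can be sketched in R. This is a minimal illustration that assumes the example's distribution function is $F(x) = x^3$ on $(0, 1)$, so that $F^{-1}(u) = u^{1/3}$; the function name `rcubic` is hypothetical and not part of the lecture material.

```r
# Inverse transform sampler for F(x) = x^3 on (0, 1) (assumed example).
# F^{-1}(u) = u^(1/3), so X = U^(1/3) with U ~ uniform(0, 1).
rcubic <- function(n) {
  u <- runif(n)   # step 2: generate uniform random numbers
  u^(1/3)         # step 3: return F^{-1}(U)
}

set.seed(1)
x <- rcubic(100000)
mean(x)          # theoretical mean is 3/4
mean(x <= 0.5)   # theoretical value is F(0.5) = 0.125
```

Comparing the empirical mean and the empirical value of $F(0.5)$ with their theoretical counterparts is a quick sanity check that the inverse was derived correctly.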
Acceptance-Rejection Method
Difference to section 2 (discrete random variables): mass functions are replaced by densities
Prerequisites
- The continuous probability density function $f(x)$ of the targeted random variable $X$
- A method for generating a random variable $Y$ having density function $g(x)$; $g(x)$ needs to be defined on the same interval as $f(x)$
Acceptance-Rejection Method
- $c$ is a constant such that $f(y)/g(y) \le c$ (for all $y$)
Theorem
- The generated $X$ has density $f$
- The number of required iterations is a geometric random variable with mean $c$
Proof
- See section 2 (discrete random variables)
Efficiency
- $c$ is the average number of iterations, so the smallest possible $c$ gives the most efficient algorithm
Acceptance-Rejection Method:
1. Generate $Y$ having density $g$
2. Accept the generated value with a probability proportional to $f(Y)/g(Y)$
   2.1 Generate a random number $U$
   2.2 If $U \le \frac{f(Y)}{c\,g(Y)}$, set $X = Y$. Otherwise go to step 1.
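Acceptance-Rejection Method: Example
Example Task: Give the rejection algorithm that generates $X$ having density $f(x) = 20x(1-x)^3$, $0 < x < 1$
Use $g(x) = 1$, $0 < x < 1$
1. Generate a rejection procedure
   1. Determine the smallest $c$ such that $f(x)/g(x) \le c$
      1.1 Determine the maximum of $f(x)/g(x) = 20x(1-x)^3$
          - 1.1.1 Differentiation yields $\frac{d}{dx}\left(\frac{f(x)}{g(x)}\right) = 20\left[(1-x)^3 - 3x(1-x)^2\right]$
          - 1.1.2 Setting this equal to $0$ shows that the maximum is attained for $x = 1/4$
      1.2 Determine the smallest $c$
          - 1.2.1 $\frac{f(x)}{g(x)} \le 20 \cdot \frac{1}{4} \cdot \left(\frac{3}{4}\right)^3 = \frac{135}{64} \equiv c$
          - 1.2.2 $\frac{f(x)}{c\,g(x)} = \frac{256}{27}\,x(1-x)^3$
2. Formulate the rejection algorithm
   1. Generate random numbers $U_1$ and $U_2$
   2. If $U_2 \le \frac{256}{27}\,U_1(1-U_1)^3$, stop and set $X = U_1$; otherwise go to step 1
Average number of iterations = $c = \frac{135}{64} \approx 2.11$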
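A minimal R sketch of this rejection algorithm, assuming the target density is $f(x) = 20x(1-x)^3$ on $(0, 1)$ with a uniform candidate density $g(x) = 1$ and $c = 135/64$. The function name `rbeta24` and the iteration counter are illustrative additions, not part of the slides.

```r
# Rejection sampler for f(x) = 20 x (1 - x)^3 on (0, 1) with g = uniform(0, 1).
# With c = 135/64 the acceptance ratio is f(y) / (c g(y)) = (256/27) y (1 - y)^3.
rbeta24 <- function(n) {
  x <- numeric(n)
  iterations <- 0
  for (i in seq_len(n)) {
    repeat {
      iterations <- iterations + 1
      y <- runif(1)                                # candidate drawn from g
      u <- runif(1)                                # acceptance draw
      if (u <= (256 / 27) * y * (1 - y)^3) break   # accept the candidate
    }
    x[i] <- y
  }
  list(sample = x, iterations_per_draw = iterations / n)
}

set.seed(1)
res <- rbeta24(20000)
mean(res$sample)          # this density is Beta(2, 4), whose mean is 1/3
res$iterations_per_draw   # should be close to c = 135/64 (about 2.11)
```

Counting iterations makes the efficiency statement checkable: the observed average number of iterations per accepted value should be close to $c$.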
Agenda
1 Introduction
2 Selected Methods
3 Generating Important Distributions
3.1 Exponential Distribution
3.2 Normal Distribution
3.3 Poisson Distribution
3.4 Nonhomogeneous Poisson Distribution
4 Summary
5 Exercises (Simulations using R)
Exponential Distribution (1/2)
Example task: Generate an exponential random variable $X$ with rate $\lambda$
Exponential distribution: $F(x) = 1 - e^{-\lambda x}$, $x \ge 0$
Inverse Transform Method
1. Find a formula for the function $F^{-1}$: solving $u = 1 - e^{-\lambda x}$ for $x$ gives $F^{-1}(u) = -\frac{1}{\lambda}\ln(1-u)$
2. Formulate the algorithm for generating the random variable
   1. Generate a random number $U$
   2. Return $X = -\frac{1}{\lambda}\ln(U)$
Since $U$ is uniform(0,1), $(1-U)$ is also uniform(0,1), and $-\ln(1-U)$ has the same distribution as $-\ln(U)$
Exponential Distribution (2/2)
[Figure: exponential distribution with $\lambda = 1$]
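The two-step algorithm for the exponential distribution can be sketched in R as follows. The sketch assumes a rate parameter `lambda` and uses $X = -\ln(U)/\lambda$; the name `rexp_inverse` is hypothetical (base R already provides `rexp` for production use).

```r
# Inverse transform method for the exponential distribution with rate lambda:
# X = -(1 / lambda) * ln(U), where U ~ uniform(0, 1).
rexp_inverse <- function(n, lambda) {
  u <- runif(n)      # step 1: generate random numbers
  -log(u) / lambda   # step 2: return X
}

set.seed(1)
x <- rexp_inverse(100000, lambda = 2)
mean(x)   # theoretical mean is 1 / lambda = 0.5
var(x)    # theoretical variance is 1 / lambda^2 = 0.25
```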
Normal Distribution (1/4)
Example 2: Normal Distribution
- Give the rejection algorithm that generates a sequence of normal random variables $X_i$ with mean $\mu$ and variance $\sigma^2$
- Approach: Generate a sequence of half-normal distributed random variables (also called absolute standard normal distributed random variables) and determine each variable's sign randomly
- Half-normal distribution: only positive values! Density: $f(x) = \frac{2}{\sqrt{2\pi}}\,e^{-x^2/2}$, $0 < x < \infty$
- Optimize the algorithm and assess its efficiency
Acceptance-Rejection Method
Set $g(x) = e^{-x}$, $0 < x < \infty$
1. Generate a rejection procedure
   1. Determine the smallest $c$ such that $f(x)/g(x) \le c$
      1.1 Determine the maximum of $f(x)/g(x) = \sqrt{2/\pi}\,e^{x - x^2/2}$
          - 1.1.1 $f(x)/g(x)$ has its maximum if $x - x^2/2$ has its maximum
          - 1.1.2 Differentiation yields $1 - x$
          - 1.1.3 Setting this equal to $0$ shows that the maximum is attained for $x = 1$
      1.2 Determine the smallest $c$
          - 1.2.1 $c = \max \frac{f(x)}{g(x)} = \frac{f(1)}{g(1)} = \sqrt{2e/\pi}$
          - 1.2.2 $\frac{f(x)}{c\,g(x)} = \exp\left\{x - \frac{x^2}{2} - \frac{1}{2}\right\} = \exp\left\{-\frac{(x-1)^2}{2}\right\}$
Normal Distribution (2/4)
2. Formulate the rejection algorithm
   0. $i = 1$
   1. Generate an exponential random variable with rate 1 called $Y_1$
   2. Generate a random number $U_1$
   3. If $U_1 \le \exp\{-(Y_1-1)^2/2\}$, accept $Y_1$ as the half-normal value. Otherwise go to step 1
   4. Generate a random number $U_2$ and set $X_i = \mu + \sigma Y_1$ if $U_2 \le 0.5$, and $X_i = \mu - \sigma Y_1$ if $U_2 > 0.5$
   5. Set $i = i + 1$ and go to step 1
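Normal Distribution (3/4)
3. Optimization
   Transformations
   - $U_1 \le \exp\{-(Y_1-1)^2/2\}$ is equivalent to $-\ln U_1 \ge (Y_1-1)^2/2$, i.e. $Y_2 \ge (Y_1-1)^2/2$ with $Y_2 = -\ln U_1$ *
   - * because $-\ln U_1$ is exponential with rate 1 (see exponential distribution)
   - Since $Y_2$ is exponential with rate 1, the memoryless property implies that, given acceptance, the excess $Y = Y_2 - (Y_1-1)^2/2$ is also an exponential with rate 1
   Optimized Algorithm
   0. $i = 1$
   1. Generate an exponential random variable with rate 1 called $Y_1$
   2. Generate an exponential random variable with rate 1 called $Y_2$
   3. If $Y_2 - (Y_1-1)^2/2 > 0$, set $Y = Y_2 - (Y_1-1)^2/2$. Otherwise go to step 1
   4. Generate a random number $U$ and set $X_i = \mu + \sigma Y_1$ if $U \le 0.5$, and $X_i = \mu - \sigma Y_1$ if $U > 0.5$
   5. Set $i = i + 1$, set $Y_1 = Y$, and go to step 2
4. Efficiency assessment
   - Average number of iterations (steps 1 and 2): $c = \sqrt{2e/\pi} \approx 1.32$ (geometrically distributed)
   - Average number of required exponential random variables: $2 \times 1.32 = 2.64$ per normal variable; reusing the leftover exponential $Y$ saves one, giving $2.64 - 1 = 1.64$
   - Average number of required squares (step 3): $1.32$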
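As a rough check of the efficiency figures, the basic (non-optimized) rejection algorithm for the normal distribution can be sketched in R. It assumes an exponential candidate with rate 1, the acceptance test $U \le \exp\{-(Y-1)^2/2\}$, and a random sign; the name `rnorm_reject` and the iteration counter are illustrative, not part of the slides.

```r
# Basic rejection algorithm: half-normal from an exponential(1) candidate,
# then a random sign and a location/scale shift.
rnorm_reject <- function(n, mu = 0, sigma = 1) {
  z <- numeric(n)
  iterations <- 0
  for (i in seq_len(n)) {
    repeat {
      iterations <- iterations + 1
      y <- -log(runif(1))                           # exponential(1) candidate
      if (runif(1) <= exp(-(y - 1)^2 / 2)) break    # accept as half-normal
    }
    s <- if (runif(1) <= 0.5) 1 else -1             # random sign
    z[i] <- s * y
  }
  list(sample = mu + sigma * z, iterations_per_draw = iterations / n)
}

set.seed(1)
res <- rnorm_reject(20000, mu = 10, sigma = 3)
mean(res$sample)          # should be close to 10
sd(res$sample)            # should be close to 3
res$iterations_per_draw   # should be close to sqrt(2 * exp(1) / pi), about 1.32
```

The optimized variant additionally reuses the leftover exponential, which this sketch deliberately omits to keep the acceptance step easy to read.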
Normal Distribution (4/4)
Graphical illustration
[Figure: the half-normal density $f(x) = \frac{2}{\sqrt{2\pi}}\,e^{-x^2/2}$, the exponential density $g(x) = e^{-x}$, and their ratio $h(x) = \frac{f(x)}{g(x)}$]
Poisson Process
Example task: Generate the first $T$ time units of a Poisson process having rate $\lambda$
Approach: Use the exponential distribution to generate the times between events (so-called interarrival times) and stop when their sum exceeds $T$
Inverse Transform Method
1. Find a formula for the function $F^{-1}$
   We already did that (see exponential distribution)
2. Formulate the algorithm for generating the Poisson process
   0. $t = 0$, $I = 0$
   1. Generate a random number $U$
   2. $t = t - \frac{1}{\lambda}\ln U$. If $t > T$, stop!
   3. $I = I + 1$, $S(I) = t$
   4. Go to step 1
The solution to the task is the sequence of the event times $S(1)$ to $S(I)$
Legend:
- $T$ = first time units
- $t$ = time
- $I$ = number of events that have occurred by time $t$ (the final value of $I$ represents the number of events that occurred by time $T$)
- $S(1), \dots, S(I)$ = event times in increasing order ($S(I)$ represents the most recent event time)
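The Poisson process algorithm can be sketched in R as follows. The sketch assumes a constant rate `rate` (the slides' $\lambda$) and a time horizon `horizon` (the slides' $T$); the function name `poisson_process` and both argument names are hypothetical.

```r
# Generate the event times of a Poisson process on (0, horizon]
# by accumulating exponential interarrival times.
poisson_process <- function(rate, horizon) {
  event_times <- numeric(0)
  now <- 0
  repeat {
    now <- now - log(runif(1)) / rate        # add one interarrival time
    if (now > horizon) break                 # stop once the horizon is exceeded
    event_times <- c(event_times, now)       # record the event time S(I)
  }
  event_times
}

set.seed(1)
s <- poisson_process(rate = 2, horizon = 1000)
length(s)   # expected number of events is rate * horizon = 2000
```

The length of the returned vector plays the role of $I$, and its last element is the most recent event time.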
Comparison: Homogeneous and Nonhomogeneous Poisson Process
(Homogeneous) Poisson Process (previously: $\lambda$)
- Events are equally likely to occur in all intervals of equal size
- The rate $\lambda$, which represents the expected number of events per unit of time, is constant
Nonhomogeneous Poisson Process (now: $\lambda(t)$)
- Events occur randomly in time
- Expected arrival rates vary with time
- The rate, which represents the expected number of events per unit of time, is not constant
- The intensity function $\lambda(t)$ represents the expected number of events around the time $t$
Nonhomogeneous Poisson Process (1/2)
Example task: Generate the first $T$ time units of a nonhomogeneous Poisson process with intensity function $\lambda(t)$
- Option 1: Thinning (also called random sampling)
- Option 2: Successive event times (see backup slides)
Inverse Transform Method
Option 1: Thinning
1. Simulate a Poisson process with a constant rate $\lambda$ such that $\lambda(t) \le \lambda$ for all $t \le T$
2. Randomly count its events: an event at time $t$ is counted with probability $\lambda(t)/\lambda$
The counted events form the nonhomogeneous process
Formulate the algorithm for generating the Poisson process
0. $t = 0$, $I = 0$
1. Generate a random number $U$
2. Set $t = t - \frac{1}{\lambda}\ln U$ and stop if $t > T$
3. Generate another random number $U$
4. If $U \le \lambda(t)/\lambda$, set $I = I + 1$ and $S(I) = t$
5. Go to step 1
Steps 1 and 2: simulation of a Poisson process. Steps 3 and 4: randomly counting events.
Legend:
- $T$ = first time units
- $t$ = time
- $I$ = number of events that have occurred by time $t$
- $S(I)$ = most recent event time
- $\lambda(t)$ = intensity function = expected number of events around $t$; $\lambda(t) \le \lambda$
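A minimal R sketch of the thinning approach. It assumes a bound `lambda_max` with $\lambda(t) \le$ `lambda_max` on the whole interval, and it uses a made-up intensity function $\lambda(t) = 1 + \sin^2(t)$ purely for illustration; the names `thinning`, `lambda_max` and `horizon` are hypothetical.

```r
# Thinning: simulate a homogeneous Poisson process with rate lambda_max and
# count an event at time t with probability intensity(t) / lambda_max.
thinning <- function(intensity, lambda_max, horizon) {
  event_times <- numeric(0)
  now <- 0
  repeat {
    now <- now - log(runif(1)) / lambda_max           # next candidate event
    if (now > horizon) break
    if (runif(1) <= intensity(now) / lambda_max) {    # count it at random
      event_times <- c(event_times, now)
    }
  }
  event_times
}

# Illustrative intensity (an assumption): lambda(t) = 1 + sin(t)^2 <= 2
intensity <- function(t) 1 + sin(t)^2

set.seed(1)
s <- thinning(intensity, lambda_max = 2, horizon = 500)
length(s)   # expected count is the integral of lambda(t) over (0, 500), about 750
```

With this intensity roughly three out of four candidate events are counted, which illustrates why a bound close to the intensity function keeps the method efficient.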
Nonhomogeneous Poisson Process (2/2)
Why is this option called “thinning”?
- Not all simulated events are counted
- Events are only counted randomly, which “thins” the (homogeneous) Poisson process
Efficiency
- Rule: The more events are counted, the more efficient the thinning approach
- Thinning is most efficient if $\lambda(t)$ is close to $\lambda$ throughout the interval, because then almost all events are counted
- Improvement (see backup slides):
  1. Break up the interval into subintervals
  2. Perform the thinning approach over each subinterval
Agenda
1 Introduction
2 Selected Methods
3 Generating Important Distributions
4 Summary
5 Exercises (Simulations using R)
Summary
- If the targeted distribution function is invertible, the Inverse Transform method can and should be used
- For the Acceptance-Rejection method you always have to find a suitable function $g$ first
Distribution | Inverse Transform Method | Acceptance-Rejection Method
Exponential | X |
Normal | | X
(Homogeneous) Poisson | X |
Nonhomogeneous Poisson | X |
Agenda
1 Introduction
2 Selected Methods
3 Generating Important Distributions
4 Summary
5 Exercises (Simulations using R)
5.1 Exercise 1 – Inverse Transform Method (Exponential Distr.)
5.2 Exercise 2 – Acceptance Rejection Method (Normal Distr.)
Exercise 1 – Task
Required knowledge
- Inverse Transform Method
- Exponential Distribution
Task
- A casualty insurance company has 1,000 policyholders, each of whom will independently present a claim in the next month with probability .05. The amounts of the claims made are independent exponential random variables with mean $800. Use simulation to estimate the probability that the sum of these claims exceeds $50,000.
Why do we have to use simulation?
- The sum of claims is approximately normally distributed with mean $40,000 ($1{,}000 \times 0.05 \times 800$), due to the central limit theorem
- However, this is only an approximation; the exact distribution of the sum is not available in a simple closed form
Recap: Algorithm for generating an exponential random variable
1. Generate a random number $U$
2. Return $X = -\frac{1}{\lambda}\ln(U)$, here with $\lambda = 1/800$
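For orientation, the normal approximation mentioned in the task can be evaluated directly in R and later compared with the simulated estimate. This is only a cross-check under the assumption that the central limit theorem approximation is adequate; it is not the exercise's solution, and all variable names are hypothetical.

```r
# Normal approximation for the total claim amount (a rough cross-check only).
n <- 1000           # policyholders
p <- 0.05           # probability of a claim per policyholder
claim_mean <- 800   # mean of the exponential claim amount
limit <- 50000

total_mean <- n * p * claim_mean   # 40000
# Variance of one policyholder's payout: p * E[X^2] - (p * E[X])^2,
# where E[X^2] = 2 * claim_mean^2 for an exponential claim amount.
total_var <- n * (p * 2 * claim_mean^2 - (p * claim_mean)^2)
total_sd <- sqrt(total_var)        # about 7899

approx_prob <- 1 - pnorm((limit - total_mean) / total_sd)
approx_prob                        # about 0.10
```

The simulated probability may differ somewhat from this value because the distribution of the sum is right-skewed, which is one reason to run the simulation at all.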
Exercise 1 – Live Demonstration
R Version: 3.0.0 (Windows 32-bit) http://cran.rstudio.com/
RStudio Version: 0.97.336 http://www.rstudio.com/ide/download/
Exercise 1 – Code
######################################
### Author: Martin Kretzer
### Date: 11 April 2012
### Descr: Simulation of an exponential distribution + inverse transform method
######################################
### 0. Functions:

### Performs simulations and returns a vector with the total claims per simulation
getSimulationsVector <- function( numberOfSimulations, numberOfCustomers, mean, probability )
{
  lambda <- 1/mean
  simulationsVector <- numeric( numberOfSimulations )
  for( i in 1:numberOfSimulations )
  {
    claimsPerSimulationVector <- c()
    for( j in 1:numberOfCustomers )
    {
      # Does policyholder j present a claim?
      uniformVarProbability <- runif(1)
      if( uniformVarProbability <= probability )
      {
        # Inverse transform method: claim = -(1/lambda) * ln(U)
        uniformVarAlgorithm <- runif(1)
        claim <- -(1/lambda)*log( uniformVarAlgorithm )
        claimsPerSimulationVector <- c( claimsPerSimulationVector, claim )
      }
    }
    # sum(c()) is 0, so a month without claims yields a total of 0
    simulationsVector[i] <- sum( claimsPerSimulationVector )
  }
  return(simulationsVector)
};

### Fraction of simulated values that reach or exceed the variable limit
getProbability <- function( simulationsVector, limit )
{
  eventcount <- sum( simulationsVector >= limit )
  probability <- eventcount / length( simulationsVector )
  return(probability)
};

######################################
### 1. Set constant variables:
numberOfSimulations <- 100000;
numberOfCustomers <- 1000
mean <- 800;
limit <- 50000;
probability <- 0.05;

### 2. Execute and output variables:
simulationsVector <- getSimulationsVector( numberOfSimulations, numberOfCustomers, mean, probability );
simulationsVector; # R maximum outputs 10000 values

### 3. Plot a histogram of the simulated sums:
hist(simulationsVector, 250, xlab="Sum of all claims", ylab="Simulations", main="Exercise: Transform Method and Exp. Distribution");

### 4. Calculate and output probability:
probability <- getProbability( simulationsVector, limit );
probability;
Exercise 1 – Solution
Executed simulations: 100,000
Estimated probability that the sum of claims exceeds $50,000:
Exercise 2 – Task
Required knowledge
- Acceptance-Rejection Method
- Normal distribution
Task
- Write a program that efficiently generates normal random variables using the acceptance-rejection method with (real) mean 10 and (real) standard deviation 3. What is your estimated mean and what is your estimated standard deviation?
Recap: (Optimized) Algorithm for generating a normal distribution
0. $i = 1$
1. Generate an exponential random variable with rate 1 called $Y_1$
2. Generate an exponential random variable with rate 1 called $Y_2$
3. If $Y_2 - (Y_1-1)^2/2 > 0$, set $Y = Y_2 - (Y_1-1)^2/2$. Otherwise go to step 1
4. Generate a random number $U$ and set $X_i = \mu + \sigma Y_1$ if $U \le 0.5$, and $X_i = \mu - \sigma Y_1$ if $U > 0.5$
5. Set $i = i + 1$, set $Y_1 = Y$, and go to step 2
Exercise 2 – Live Demonstration
R Version: 3.0.0 (Windows 32-bit) http://cran.rstudio.com/
RStudio Version: 0.97.336 http://www.rstudio.com/ide/download/
Exercise 2 – Code
######################################
### Author: Martin Kretzer
### Date: 12 April 2012
### Descr: Simulation of a half-normal distribution + accept/reject
######################################
### 0. Function:

### Performs simulations and returns a vector of all simulated values
getSimulationsVector <- function( numberOfSimulations, mean, stdDev )
{
  # Define variables
  simulations.vector <- rep(0, numberOfSimulations)

  # Generate the exponential variable y1
  y1 <- runif(1)
  y1 <- -log(y1)
  for( i in 1:numberOfSimulations )
  {
    # Generate the independent exp. variable y2
    y2 <- runif(1)
    y2 <- -log(y2)
    # Acceptance-Rejection procedure
    while( y2 < (y1-1)^2/2 )
    {
      # if rejected, generate two new independent exp. variables y1 and
      # y2 and repeat procedure
      y1 <- runif(1)
      y1 <- -log(y1)
      y2 <- runif(1)
      y2 <- -log(y2)
    }
    # Leftover exponential, reused as y1 for the next normal variable
    y3 <- y2-(y1-1)^2/2
    # Generate a random number U and store the simulated variable y1
    # (with random sign, shifted and scaled) in simulations.vector
    u <- runif(1)
    if( u <= 0.5 )
    {
      simulations.vector[i] <- mean + (y1*stdDev)
    }
    else
    {
      simulations.vector[i] <- mean + (-y1*stdDev)
    }
    y1 <- y3
  }
  return(simulations.vector)
};

######################################
### Function End
######################################

simulations.vector <- getSimulationsVector( 100000, 10, 3 )
summary( simulations.vector )
sd( simulations.vector )
hist(simulations.vector, 250, xlab="", ylab="Simulations", main="Exercise: Acceptance-Rejection Method and Normal Distribution");
Exercise 2 – Solution
Executed simulations: 100,000
Estimated mean: 10.010
Estimated standard deviation: 2.993299
Thank you for your attention.
Appendix
1 Backup Slides
2 Bibliography
Poisson Process Introduction (1/2)
- $\lambda$ is the expected number of events in one period
- $n$ is the number of subintervals
- $p$ is the probability that there is an event in one subinterval (binomial distribution)
- Problem: there might be more than one event per subinterval
- Solution: smaller subintervals; this increases the number of subintervals, so $n \to \infty$ is required
- Result: Poisson process having rate $\lambda$ ($\lambda = np$)
Assumptions
- Numbers of events in disjoint intervals are independent
- Homogeneity: same probability for intervals of equal length
Poisson Process Introduction (2/2)
Interarrival times
- $X_n$ = [time of the $n$-th event] - [time of the $(n-1)$-th event]
- E.g., $X_1 = 5$ and $X_2 = 10$ means that the first event occurred at time 5 and the second event at time 15
- $X_1, X_2, \dots$ are independent
- $X_1, X_2, \dots$ are identically distributed exponential random variables with rate $\lambda$
Event times
- $S_n = \sum_{i=1}^{n} X_i$ = time of the $n$-th event
- Distribution of $S_n$ = sum of $n$ independent exponential random variables, each having parameter $\lambda$ = Gamma distribution with parameters $(n, \lambda)$
- Density function of $S_n$: $f(t) = \lambda e^{-\lambda t}\,\frac{(\lambda t)^{n-1}}{(n-1)!}$, $t \ge 0$
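The statement that the time of the $n$-th event follows a Gamma distribution can be checked numerically. The sketch below assumes rate $\lambda = 2$ and $n = 5$ (both chosen arbitrarily for illustration) and compares simulated sums of exponential interarrival times with R's `pgamma`.

```r
set.seed(1)
lambda <- 2   # rate of the Poisson process (illustrative value)
n <- 5        # index of the event whose time S_n is examined

# Each row holds n exponential interarrival times; row sums are draws of S_n.
x <- matrix(-log(runif(50000 * n)) / lambda, ncol = n)
s_n <- rowSums(x)

mean(s_n)                             # Gamma(n, lambda) mean is n / lambda = 2.5
mean(s_n <= 2)                        # empirical P{S_n <= 2}
pgamma(2, shape = n, rate = lambda)   # theoretical P{S_n <= 2}
```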
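Nonhomogeneous Poisson Process: Option 1 Improvement
Idea
1. Break up the interval into subintervals
2. Perform the thinning approach over each subinterval
Hence “$\lambda(t)$ close to $\lambda$ throughout the interval” is more likely, and the thinning approach will be more efficient
General Algorithm
1. Determine appropriate values $k$, $0 = t_0 < t_1 < \dots < t_k < t_{k+1} = T$, and $\lambda_1, \dots, \lambda_{k+1}$ such that
   $\lambda(s) \le \lambda_i$ if $t_{i-1} \le s < t_i$, $i = 1, \dots, k+1$ (Equation 1)
2. Generate the nonhomogeneous Poisson process over the interval $(t_{i-1}, t_i)$
   2.1 Generate exponential random variables with rate $\lambda_i$
   2.2 Accept the generated event occurring at time $s$, $s \in (t_{i-1}, t_i)$, with probability $\lambda(s)/\lambda_i$
Concrete algorithm (Generate the first $T$ time units of a nonhomogeneous Poisson process)
1. $t = 0$, $J = 1$, $I = 0$
2. Generate a random number $U$ and set $X = -\frac{1}{\lambda_J}\ln U$
3. If $t + X > t_J$, go to step 8
4. $t = t + X$
5. Generate a random number $U$
6. If $U \le \lambda(t)/\lambda_J$, set $I = I + 1$, $S(I) = t$
7. Go to step 2
8. If $J = k + 1$, stop
9. $X = (X - t_J + t)\,\lambda_J/\lambda_{J+1}$, $t = t_J$, $J = J + 1$
10. Go to step 3
Legend:
- $\lambda(t)$ = intensity function, bounded by $\lambda_J$ when Equation 1 is satisfied
- $t$ = present time
- $J$ = present interval
- $I$ = number of events so far
- $S(1), \dots, S(I)$ = event times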
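A simplified R sketch of thinning over subintervals. Note the deliberate simplification: instead of rescaling the leftover exponential when crossing into the next subinterval, this version simply restarts at each boundary, which is also valid because the exponential distribution is memoryless. The function name, the argument names `breaks` and `bounds`, and the intensity $\lambda(t) = t$ are all assumptions made for illustration.

```r
# Thinning applied separately on each subinterval (breaks[j], breaks[j + 1]],
# where bounds[j] is an upper bound of the intensity on that subinterval.
thinning_piecewise <- function(intensity, breaks, bounds) {
  stopifnot(length(bounds) == length(breaks) - 1)
  event_times <- numeric(0)
  for (j in seq_along(bounds)) {
    now <- breaks[j]                                 # restart at the boundary
    repeat {
      now <- now - log(runif(1)) / bounds[j]         # candidate event
      if (now > breaks[j + 1]) break                 # leave this subinterval
      if (runif(1) <= intensity(now) / bounds[j]) {  # count it at random
        event_times <- c(event_times, now)
      }
    }
  }
  event_times
}

# Illustrative setup (an assumption): lambda(t) = t on (0, 10), unit-length
# subintervals, each bounded by the intensity at its right endpoint.
intensity <- function(t) t
breaks <- 0:10
bounds <- 1:10

set.seed(1)
s <- thinning_piecewise(intensity, breaks, bounds)
length(s)   # expected count is the integral of t over (0, 10) = 50
```

With a single bound of 10 over the whole interval, about half of the candidate events would be discarded; with the ten local bounds, only about 5 of the expected 55 candidates are discarded.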
Nonhomogeneous Poisson Process: Option 2
Option 2: Successive event times
Idea:
1. (Directly) generate the event time $S_1$
   1.1 Generate the event time $S_1$ from the distribution $F_0$
       - The simulated value is $s_1$ ($S_1 = s_1$; only differentiated because “$S_1$” refers to the event time and “$s_1$” refers to the value of $S_1$)
       - (e.g., simulate by using the Inverse Transform Method)
2. Use $s_1$ to generate $S_2$
   2.1 Generate a value $X$ from the distribution $F_{s_1}$
   2.2 $S_2 = s_1 + X$
3. Use $s_2$ to generate $S_3$
   3.1 Generate a value $X$ from the distribution $F_{s_2}$
   3.2 $S_3 = s_2 + X$
…
$F_s$ = distribution of the additional time until the next event, given an event at time $s$
$F_s(x) = 1 - \exp\left\{-\int_0^x \lambda(s+y)\,dy\right\}$ = probability that the time from the current event to the next event is less than $x$
Appendix
1 Backup Slides
2 Bibliography
Bibliography
ETH Zürich. 2013. The Uniform Distribution, ETH Zürich, Zürich, CH (available online at http://stat.ethz.ch/R-manual/R-devel/library/stats/html/Uniform.html).
Haugh, M. 2010. “Generating Random Variables and Stochastic Processes,” Columbia University, US (available online at http://www.columbia.edu/~mh2078/MCS_Generate_RVars.pdf).
The R Core Team. 2013. “R: A Language and Environment for Statistical Computing. Reference Index. Version 3.0.0 (2013-04-03),” (part of the R download).
R-Project. 2013. “Probability Distributions,” (available online at http://cran.r-project.org/web/views/Distributions.html).
Ross, S. M. 2013. Simulation, 5th ed., San Diego, CA: Academic Press.
Sigman, K. 2009. “Inverse Transform Method,” Columbia University, US (available online at http://www.columbia.edu/~ks20/4404-13-Spring/4404-Notes-ITM.pdf).
Valdez, E. A. 2008. “Generating Continuous Random Variables,” University of Connecticut, US (available at http://www.math.uconn.edu/~valdez/math276s08/Math276-Week45.pdf).
Venables, W. N., Smith, D. M., and the R Core Team. 2013. “An Introduction to R. Notes on R: A Programming Environment for Data Analysis and Graphics Version 3.0.0 (2013-04-03),” (part of the R download).