Hadley Wickham
Stat405
Simulation
Thursday, 23 September 2010
1. Homework comments
2. Mathematical approach
3. More randomness
4. Random number generators
Homework
Just graded your organisation and code, and focused my comments there.
Biggest overall tip: use floating figures (the figure environment, \begin{figure} ... \end{figure}) with captions. Use \label{} and \ref{} to refer to the figure in the text.
Captions should start with a brief description of the plot (including bin width if applicable) and finish with a brief description of what the plot reveals.
Will grade captions more aggressively in the future.
Code
Gives explicit technical details.
Your comments should remind you why you did what you did.
Most readers will not look at it, but it’s very important to include it, because it means that others can check your work.
Mathematical approach
Why are we doing this simulation? Could work out the expected value and variance mathematically. So let’s do it!
Simplifying assumption: slots are iid.
calculate_prize <- function(windows) {
  payoffs <- c("DD" = 800, "7" = 80, "BBB" = 40,
    "BB" = 25, "B" = 10, "C" = 10, "0" = 0)

  same <- length(unique(windows)) == 1
  allbars <- all(windows %in% c("B", "BB", "BBB"))

  if (same) {
    prize <- payoffs[windows[1]]
  } else if (allbars) {
    prize <- 5
  } else {
    cherries <- sum(windows == "C")
    diamonds <- sum(windows == "DD")

    prize <- c(0, 2, 5)[cherries + 1] * c(1, 2, 4)[diamonds + 1]
  }
  prize
}
slots <- read.csv("slots.csv", stringsAsFactors = FALSE)

# Calculate empirical distribution
dist <- table(c(slots$w1, slots$w2, slots$w3))
dist <- dist / sum(dist)

# From here on, slots is the vector of symbol names
slots <- names(dist)
poss <- expand.grid(
  w1 = slots, w2 = slots, w3 = slots,
  stringsAsFactors = FALSE
)

poss$prize <- NA
for(i in seq_len(nrow(poss))) {
  window <- as.character(poss[i, 1:3])
  poss$prize[i] <- calculate_prize(window)
}
Your turn
How can you calculate the probability of each combination?
(Hint: think about subsetting. Another hint: think about the table and character subsetting. Final hint: you can do this in one line of code)
Then work out the expected value (the payoff).
poss$prob <- with(poss, dist[w1] * dist[w2] * dist[w3])
(poss_mean <- with(poss, sum(prob * prize)))
# How do we determine the variance of this
# estimator?
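A minimal sketch of the same character-subsetting trick on a toy machine, small enough to check by hand. The two-symbol distribution and the payoff rule (1 unit per cherry) are made up for illustration; they are not the course data.

```r
# Hypothetical two-symbol machine; these probabilities are invented.
dist <- c("C" = 0.25, "0" = 0.75)
symbols <- names(dist)

poss <- expand.grid(
  w1 = symbols, w2 = symbols, w3 = symbols,
  stringsAsFactors = FALSE
)

# Toy payoff (an assumption): 1 unit for every cherry showing.
poss$prize <- rowSums(poss[, 1:3] == "C")

# Character subsetting looks up each window's probability by name;
# the iid assumption lets us multiply across the three windows.
poss$prob <- unname(dist[poss$w1] * dist[poss$w2] * dist[poss$w3])

total_prob <- sum(poss$prob)               # should be 1
toy_mean <- sum(poss$prob * poss$prize)    # by hand: 3 * 0.25
```

Checking that the probabilities sum to 1 is a cheap sanity test before trusting the expected value.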
More randomness
Sample
Very useful for selecting from a discrete set (vector) of possibilities.
Four arguments: x, size, replace, prob
How can you?
Choose 1 from vector
Choose n from vector, with replacement
Choose n from vector, without replacement
Perform a weighted sample
Put a vector in random order
Put a data frame in random order
# Choose 1 from vector
sample(letters, 1)

# Choose n from vector, without replacement
sample(letters, 10)
sample(letters, 40)  # error: only 26 letters to choose from

# Choose n from vector, with replacement
sample(letters, 40, replace = TRUE)

# Perform a weighted sample
sample(names(dist), prob = dist)
# Put a vector in random order
sample(letters)

# Put a data frame in random order
# (slots must be the data frame from read.csv, not the vector of names)
slots[sample(1:nrow(slots)), ]
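A small sketch for checking a weighted sample and a data frame shuffle. The weights, the seed and the toy data frame are assumptions made up for this example.

```r
set.seed(405)  # arbitrary seed, only so the run can be repeated

# Weighted sample: observed frequencies should be close to the weights
weights <- c(a = 0.7, b = 0.2, c = 0.1)  # made-up weights
draws <- sample(names(weights), size = 10000, replace = TRUE, prob = weights)
freq <- table(draws) / length(draws)

# Shuffle the rows of a toy data frame by sampling its row indices
df <- data.frame(id = 1:5, value = letters[1:5])
shuffled <- df[sample(nrow(df)), ]
```

Sampling the row indices keeps each row intact, so only the order changes.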
Your turn
Source of randomness in random_prize is sample. Other options are:
runif, rbinom, rnbinom, rpois, rnorm, rt, rcauchy
What sort of random variables do they generate and what are their parameters? Practice generating numbers from them.
Function   Distribution        Parameters
runif      Uniform             min, max
rbinom     Binomial            size, prob
rnbinom    Negative binomial   size, prob
rpois      Poisson             lambda
rnorm      Normal              mean, sd
rt         t                   df
rcauchy    Cauchy              location, scale
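One possible way to practise: draw from each generator and compare the sample mean with the theoretical mean. The parameter values and the seed below are arbitrary choices for illustration.

```r
set.seed(405)  # arbitrary seed, only so the run can be repeated
n <- 10000

draws <- list(
  runif   = runif(n, min = 0, max = 1),
  rbinom  = rbinom(n, size = 10, prob = 0.3),
  rnbinom = rnbinom(n, size = 5, prob = 0.5),
  rpois   = rpois(n, lambda = 4),
  rnorm   = rnorm(n, mean = 2, sd = 1),
  rt      = rt(n, df = 10)
)

# Theoretical means for the parameters chosen above
theory <- c(runif = 0.5, rbinom = 3, rnbinom = 5,
  rpois = 4, rnorm = 2, rt = 0)
sim_means <- sapply(draws, mean)
round(cbind(theory, simulated = sim_means), 2)

# The Cauchy has no mean, so look at the median instead
cauchy_median <- median(rcauchy(n, location = 0, scale = 1))
```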
Distributions
Other functions
• r to generate random numbers
• d to compute density f(x)
• p to compute distribution F(x)
• q to compute inverse distribution F⁻¹(x)
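A short sketch of the d, p and q prefixes, using the normal and binomial as examples. The particular values are arbitrary.

```r
# d: density f(x)
d0 <- dnorm(0)              # standard normal density at 0, 1 / sqrt(2 * pi)

# p: distribution function F(x)
p196 <- pnorm(1.96)         # close to 0.975

# q: inverse distribution function, so q undoes p
q975 <- qnorm(0.975)        # close to 1.96
round_trip <- qnorm(pnorm(1.5))

# Same prefixes work for discrete distributions
p_eq_3 <- dbinom(3, size = 10, prob = 0.3)   # P(X = 3)
p_le_3 <- pbinom(3, size = 10, prob = 0.3)   # P(X <= 3)
```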
# Easy to combine random variables
library(ggplot2)  # provides qplot

n <- rpois(10000, lambda = 10)
x <- rbinom(10000, size = n, prob = 0.3)
qplot(x, binwidth = 1)

p <- runif(10000)
x <- rbinom(10000, size = 10, prob = p)
qplot(x, binwidth = 0.1)

# cf.
qplot(runif(10000), binwidth = 0.1)
# Simulation is a powerful tool for exploring
# distributions. Easy to do computationally; hard
# to do analytically

qplot(1 / rpois(10000, lambda = 20))
qplot(1 / runif(10000, min = 0.5, max = 2))

qplot(rnorm(10000) ^ 2)
qplot(rnorm(10000) / rnorm(10000))
# http://www.johndcook.com/distribution_chart.html
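Two of these transformations have known answers: a squared standard normal is chi-squared with 1 degree of freedom, and the ratio of two independent standard normals is standard Cauchy. A sketch of checking this numerically by comparing quartiles; the seed and sample size are arbitrary.

```r
set.seed(405)  # arbitrary seed, only so the run can be repeated
n <- 100000
probs <- c(0.25, 0.5, 0.75)

sq <- rnorm(n)^2
ratio <- rnorm(n) / rnorm(n)

# Simulated quartiles next to the theoretical ones
chisq_check <- rbind(
  simulated = quantile(sq, probs),
  theory = qchisq(probs, df = 1)
)
cauchy_check <- rbind(
  simulated = quantile(ratio, probs),
  theory = qcauchy(probs)
)
```

Quartiles are used rather than means because the Cauchy has no mean to compare against.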
Your turn
RNG
Computers are deterministic, so how do they produce randomness?
How do computers generate random numbers?
They don’t! Actually produce pseudo-random sequences.
Common approach: X_{n+1} = (a X_n + c) mod m
(http://en.wikipedia.org/wiki/Linear_congruential_generator)
next_val <- function(x, a, c, m) {
  (a * x + c) %% m
}

x <- 1001
(x <- next_val(x, 1664525, 1013904223, 2^32))

# http://en.wikipedia.org/wiki/List_of_pseudorandom_number_generators

# R uses
# http://en.wikipedia.org/wiki/Mersenne_twister
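A sketch of turning next_val into a generator of several values on [0, 1). The wrapper name lcg and its defaults are made up for this example; the constants are the ones used above. This only illustrates the recurrence and is not how R's own generator works.

```r
next_val <- function(x, a, c, m) {
  (a * x + c) %% m
}

# Hypothetical helper: iterate the recurrence n times from a seed
lcg <- function(n, seed, a = 1664525, c = 1013904223, m = 2^32) {
  out <- numeric(n)
  x <- seed
  for (i in seq_len(n)) {
    x <- next_val(x, a, c, m)
    out[i] <- x
  }
  # With these constants a * x + c stays below 2^53, so the
  # arithmetic is exact in double precision
  out / m  # scale to [0, 1)
}

u <- lcg(5, seed = 1001)
u_again <- lcg(5, seed = 1001)  # same seed, same sequence
```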
# Random numbers are reproducible!
set.seed(1)
runif(10)

set.seed(1)
runif(10)

# Very useful when required to make a reproducible
# example that involves randomness
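A sketch that checks the reproducibility claim directly instead of comparing printed output by eye. The variable names are arbitrary.

```r
set.seed(1)
first <- runif(10)

set.seed(1)
second <- runif(10)

# Same seed, same sequence
same_after_reseed <- identical(first, second)

# Carrying on without reseeding moves further along the sequence
third <- runif(10)
same_without_reseed <- identical(first, third)
```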
True randomness
Atmospheric radio noise: http://www.random.org. Use from R with the random package.
Not really important unless you’re running a lottery. (With a pseudo-random generator, observing a long enough sequence lets you predict the next value.)