Hadley Wickham Stat405 Simulation Thursday, 23 September 2010

10 simulation


Page 1: 10 simulation

Hadley Wickham

Stat405 Simulation

Thursday, 23 September 2010

Page 2: 10 simulation

1. Homework comments

2. Mathematical approach

3. More randomness

4. Random number generators


Page 3: 10 simulation

Homework

Just graded your organisation and code, and focused my comments there.

Biggest overall tip: use floating figures (the figure environment, \begin{figure} ... \end{figure}) with captions. Use \label{} and \ref{} to refer to the figure in the text.

Captions should start with a brief description of the plot (including bin width if applicable) and finish with a brief description of what the plot reveals.

Will grade captions more aggressively in the future.


Page 4: 10 simulation

Code

Gives explicit technical details.

Your comments should remind you why you did what you did.

Most readers will not look at it, but it’s very important to include it, because it means that others can check your work.


Page 5: 10 simulation

Mathematical approach

Why are we doing this simulation? Could work out the expected value and variance mathematically. So let’s do it!

Simplifying assumption: slots are iid.
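A sketch of the calculation in symbols, assuming the three windows are iid. The notation is an assumption introduced here: p(s) is the probability of symbol s in one window, and g(w_1, w_2, w_3) is the prize for a combination.

```latex
\begin{align*}
\mathrm{E}[\text{prize}] &= \sum_{w_1, w_2, w_3} p(w_1)\, p(w_2)\, p(w_3)\, g(w_1, w_2, w_3) \\
\mathrm{Var}[\text{prize}] &= \sum_{w_1, w_2, w_3} p(w_1)\, p(w_2)\, p(w_3)\, g(w_1, w_2, w_3)^2 - \mathrm{E}[\text{prize}]^2
\end{align*}
```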


Page 6: 10 simulation

calculate_prize <- function(windows) {
  payoffs <- c("DD" = 800, "7" = 80, "BBB" = 40,
    "BB" = 25, "B" = 10, "C" = 10, "0" = 0)

  same <- length(unique(windows)) == 1
  allbars <- all(windows %in% c("B", "BB", "BBB"))

  if (same) {
    # Three of a kind: look up the payoff by symbol name
    prize <- payoffs[windows[1]]
  } else if (allbars) {
    # Any mix of bars
    prize <- 5
  } else {
    cherries <- sum(windows == "C")
    diamonds <- sum(windows == "DD")

    # 0, 1 or 2 cherries pay 0, 2 or 5; each diamond doubles the prize
    prize <- c(0, 2, 5)[cherries + 1] *
      c(1, 2, 4)[diamonds + 1]
  }
  prize
}


Page 7: 10 simulation

slots <- read.csv("slots.csv", stringsAsFactors = FALSE)

# Calculate empirical distribution
dist <- table(c(slots$w1, slots$w2, slots$w3))
dist <- dist / sum(dist)

slots <- names(dist)


Page 8: 10 simulation

poss <- expand.grid(
  w1 = slots, w2 = slots, w3 = slots,
  stringsAsFactors = FALSE
)

poss$prize <- NA
for (i in seq_len(nrow(poss))) {
  window <- as.character(poss[i, 1:3])
  poss$prize[i] <- calculate_prize(window)
}


Page 9: 10 simulation

Your turn

How can you calculate the probability of each combination?

(Hint: think about subsetting. Another hint: think about the table and character subsetting. Final hint: you can do this in one line of code)

Then work out the expected value (the payoff).


Page 10: 10 simulation

poss$prob <- with(poss, dist[w1] * dist[w2] * dist[w3])

(poss_mean <- with(poss, sum(prob * prize)))

# How do we determine the variance of this
# estimator?
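One possible reading of the question above is the variance of a single play's prize, which follows from the same table of combinations. The sketch below is self-contained and uses a made-up three-symbol machine: the probabilities and toy_prize() are hypothetical stand-ins, not the course's slots.csv data or calculate_prize().

```r
# Hypothetical symbol probabilities, made up for illustration
dist <- c(A = 0.5, B = 0.3, C = 0.2)
symbols <- names(dist)

# Toy payoff rule (an assumption): three of a kind pays a fixed
# amount, anything else pays nothing
toy_prize <- function(w1, w2, w3) {
  payoffs <- c(A = 1, B = 5, C = 20)
  ifelse(w1 == w2 & w2 == w3, payoffs[w1], 0)
}

# Every combination, its prize, and its probability under iid windows
poss <- expand.grid(w1 = symbols, w2 = symbols, w3 = symbols,
  stringsAsFactors = FALSE)
poss$prize <- with(poss, toy_prize(w1, w2, w3))
poss$prob <- with(poss, unname(dist[w1] * dist[w2] * dist[w3]))

# Expected value, then variance as E[X^2] - E[X]^2
poss_mean <- with(poss, sum(prob * prize))
poss_var <- with(poss, sum(prob * prize^2)) - poss_mean^2

# Standard error of the average prize over n simulated plays
n <- 1000
se_n <- sqrt(poss_var / n)

c(mean = poss_mean, var = poss_var, se = se_n)
```

This only covers the spread of prizes under the model. Uncertainty from estimating dist from data is a separate question that the sketch does not address.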


Page 11: 10 simulation

More randomness


Page 12: 10 simulation

Sample

Very useful for selecting from a discrete set (vector) of possibilities.

Four arguments: x, size, replace, prob


Page 13: 10 simulation

How can you?

Choose 1 from vector

Choose n from vector, with replacement

Choose n from vector, without replacement

Perform a weighted sample

Put a vector in random order

Put a data frame in random order


Page 14: 10 simulation

# Choose 1 from vector
sample(letters, 1)

# Choose n from vector, without replacement
sample(letters, 10)
sample(letters, 40)  # error: cannot take 40 from 26 without replacement

# Choose n from vector, with replacement
sample(letters, 40, replace = TRUE)

# Perform a weighted sample
sample(names(dist), prob = dist)


Page 15: 10 simulation

# Put a vector in random order
sample(letters)

# Put a data frame in random order
slots[sample(1:nrow(slots)), ]
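A self-contained sketch that puts these pieces together, assuming made-up symbol probabilities (not estimated from slots.csv) and a small hypothetical data frame. Drawing three independent windows needs replace = TRUE, because the same symbol can appear more than once.

```r
set.seed(405)  # arbitrary seed so the example is repeatable

# Made-up symbol probabilities (hypothetical)
dist <- c("0" = 0.5, "B" = 0.25, "7" = 0.15, "C" = 0.1)

# One play: three independent windows, so sample with replacement
windows <- sample(names(dist), size = 3, replace = TRUE, prob = dist)

# Several plays at once: one row per play
n_plays <- 5
plays <- matrix(
  sample(names(dist), size = 3 * n_plays, replace = TRUE, prob = dist),
  ncol = 3, dimnames = list(NULL, c("w1", "w2", "w3"))
)

# Shuffle the rows of a data frame by permuting its row indices
df <- data.frame(id = 1:4, value = c("a", "b", "c", "d"))
shuffled <- df[sample(nrow(df)), ]

windows
plays
shuffled
```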


Page 16: 10 simulation

Your turn

Source of randomness in random_prize is sample. Other options are:

runif, rbinom, rnbinom, rpois, rnorm, rt, rcauchy

What sort of random variables do they generate and what are their parameters? Practice generating numbers from them.


Page 17: 10 simulation

Function   Distribution        Parameters
runif      Uniform             min, max
rbinom     Binomial            size, prob
rnbinom    Negative binomial   size, prob
rpois      Poisson             lambda
rnorm      Normal              mean, sd
rt         t                   df
rcauchy    Cauchy              location, scale
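A short practice sketch that draws a few values from each generator in the table. The parameter values are arbitrary choices for illustration.

```r
set.seed(1)  # arbitrary seed so the draws are repeatable
n <- 5

draws <- list(
  runif   = runif(n, min = 0, max = 1),
  rbinom  = rbinom(n, size = 10, prob = 0.3),
  rnbinom = rnbinom(n, size = 3, prob = 0.5),
  rpois   = rpois(n, lambda = 4),
  rnorm   = rnorm(n, mean = 0, sd = 1),
  rt      = rt(n, df = 5),
  rcauchy = rcauchy(n, location = 0, scale = 1)
)

str(draws)
```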


Page 18: 10 simulation

Distributions

Other functions

• r to generate random numbers

• d to compute density f(x)

• p to compute distribution F(x)

• q to compute inverse distribution F⁻¹(x)
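A minimal sketch of the d, p and q prefixes, using the normal and binomial distributions as examples. The evaluation points are arbitrary.

```r
# Continuous example: standard normal
dnorm(0)       # density f(0), equal to 1 / sqrt(2 * pi)
pnorm(1.96)    # distribution F(1.96), close to 0.975
qnorm(0.975)   # inverse distribution, close to 1.96

# q undoes p
x <- c(-1, 0, 2.5)
all.equal(qnorm(pnorm(x)), x)

# Discrete example: binomial with size 10 and prob 0.3
dbinom(3, size = 10, prob = 0.3)   # P(X = 3)
pbinom(3, size = 10, prob = 0.3)   # P(X <= 3)
```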


Page 19: 10 simulation

# Easy to combine random variables

library(ggplot2)  # provides qplot()

n <- rpois(10000, lambda = 10)
x <- rbinom(10000, size = n, prob = 0.3)
qplot(x, binwidth = 1)

p <- runif(10000)
x <- rbinom(10000, size = 10, prob = p)
qplot(x, binwidth = 0.1)

# cf.
qplot(runif(10000), binwidth = 0.1)
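A numeric check of the same two combinations, without plots. It relies on two standard results: a binomial count with a Poisson(10) number of trials and prob 0.3 is Poisson with mean 3, and a binomial with size 10 and a uniform prob is uniform on 0 to 10. The seed and variable names are arbitrary choices.

```r
set.seed(2010)  # arbitrary seed
n_sim <- 10000

# Poisson number of trials, then binomial successes
n <- rpois(n_sim, lambda = 10)
x <- rbinom(n_sim, size = n, prob = 0.3)
c(mean = mean(x), var = var(x))  # both should be near 10 * 0.3 = 3

# Uniform success probability, then binomial successes
p <- runif(n_sim)
y <- rbinom(n_sim, size = 10, prob = p)
props <- table(factor(y, levels = 0:10)) / n_sim
round(props, 2)  # each should be near 1 / 11
```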


Page 20: 10 simulation

# Simulation is a powerful tool for exploring
# distributions. Easy to do computationally; hard
# to do analytically

qplot(1 / rpois(10000, lambda = 20))
qplot(1 / runif(10000, min = 0.5, max = 2))

qplot(rnorm(10000) ^ 2)
qplot(rnorm(10000) / rnorm(10000))

# http://www.johndcook.com/distribution_chart.html
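For the last two plots the analytic answers are known: a squared standard normal is chi-squared with 1 degree of freedom, and the ratio of two independent standard normals is standard Cauchy. A sketch that compares simulated quartiles with the theoretical ones (the seed and the choice of quartiles are arbitrary):

```r
set.seed(23)  # arbitrary seed
n_sim <- 10000
probs <- c(0.25, 0.5, 0.75)

# Squared normal against chi-squared with 1 degree of freedom
sq <- rnorm(n_sim)^2
rbind(simulated = quantile(sq, probs),
  chisq_1 = qchisq(probs, df = 1))

# Ratio of normals against the standard Cauchy
ratio <- rnorm(n_sim) / rnorm(n_sim)
rbind(simulated = quantile(ratio, probs),
  cauchy = qcauchy(probs))
```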


Page 21: 10 simulation

Your turn


Page 22: 10 simulation

RNG

Computers are deterministic, so how do they produce randomness?


Page 23: 10 simulation


Page 24: 10 simulation

How do computers generate random numbers?

They don’t! Actually produce pseudo-random sequences.

Common approach: X_{n+1} = (a X_n + c) mod m

(http://en.wikipedia.org/wiki/Linear_congruential_generator)


Page 25: 10 simulation

next_val <- function(x, a, c, m) {
  (a * x + c) %% m
}

x <- 1001
(x <- next_val(x, 1664525, 1013904223, 2^32))

# http://en.wikipedia.org/wiki/List_of_pseudorandom_number_generators

# R uses
# http://en.wikipedia.org/wiki/Mersenne_twister
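A sketch of how repeated calls to next_val could be turned into a stream of numbers in [0, 1). The lcg() wrapper and its argument names are hypothetical, introduced only for this example; it is not how R's own generator works.

```r
next_val <- function(x, a, c, m) {
  (a * x + c) %% m
}

# Hypothetical helper: n values in [0, 1) from a linear
# congruential generator started at `seed`
lcg <- function(n, seed, a = 1664525, c = 1013904223, m = 2^32) {
  out <- numeric(n)
  x <- seed
  for (i in seq_len(n)) {
    x <- next_val(x, a, c, m)  # advance the state
    out[i] <- x / m            # rescale the state to [0, 1)
  }
  out
}

u <- lcg(5, seed = 1001)
u

# Same seed, same sequence: the generator is fully deterministic
identical(lcg(5, seed = 1001), u)
```

Because each value depends only on the previous state, anyone who knows a, c, m and one state can reproduce everything that follows.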


Page 26: 10 simulation

# Random numbers are reproducible!

set.seed(1)
runif(10)

set.seed(1)
runif(10)

# Very useful when required to make a reproducible
# example that involves randomness
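A small sketch that checks the claim programmatically rather than by eye. The variable names are arbitrary.

```r
set.seed(1)
a <- runif(10)

set.seed(1)
b <- runif(10)

identical(a, b)  # TRUE: same seed, same numbers

# Without resetting the seed, the stream simply continues
d <- runif(10)
identical(a, d)  # FALSE
```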


Page 27: 10 simulation

True randomness

Atmospheric radio noise: http://www.random.org. Use from R with the random package.

Not really important unless you’re running a lottery. (With a pseudo-random generator, observing a long enough sequence lets you predict the next value.)
