Hadley Wickham Stat310 Sampling distributions Monday, 22 March 2010

17 Sampling Dist


Page 1: 17 Sampling Dist

Hadley Wickham

Stat310

Sampling distributions

Monday, 22 March 2010

Page 2: 17 Sampling Dist

Quiz

• Pick up quiz on your way in

• Start at 1pm

• Finish at 1:10pm

• Closed book


Page 3: 17 Sampling Dist

http://xkcd.com/715/

Page 4: 17 Sampling Dist

1. Quiz

2. CLT & approximations

3. Sampling distributions

4. Example

5. More theory


Page 5: 17 Sampling Dist

CLT

Central limit theorem.

The distribution of a sample mean is approximately normal when n gets big.
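For reference, one standard way to write the classical CLT is sketched below. This is an assumption about the form intended here, not a recovery of the slide's own formula; it takes X_1, ..., X_n to be iid with mean μ and finite variance σ².

```latex
\[
\sqrt{n}\,\frac{\bar{X}_n - \mu}{\sigma} \xrightarrow{d} N(0, 1)
\quad \text{as } n \to \infty,
\qquad \text{where } \bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i .
\]
```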


Page 6: 17 Sampling Dist

Approximation

This implies that if n is big then ...
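A minimal sketch of how such an approximation can be used in practice: treat the mean of n iid draws as roughly N(μ, σ²/n) and compare the resulting probability with a simulation. The Exponential(1) population, the sample size n = 50, the cutoff 1.1, and the function names are all illustrative assumptions, not taken from the slides.

```python
import random
from statistics import NormalDist

def clt_approx_prob(mu, sigma, n, c):
    """Normal approximation to P(mean of n iid draws <= c)."""
    return NormalDist(mu, sigma / n ** 0.5).cdf(c)

def simulated_prob(n, c, reps=20_000, seed=310):
    """Monte Carlo estimate of the same probability for Exponential(1) draws."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        if xbar <= c:
            hits += 1
    return hits / reps

# Exponential(1) has mean 1 and standard deviation 1 (assumed example population).
approx = clt_approx_prob(mu=1.0, sigma=1.0, n=50, c=1.1)
sim = simulated_prob(n=50, c=1.1)
print(f"normal approximation: {approx:.3f}, simulation: {sim:.3f}")
```

The two numbers should be close but not identical, since the exponential distribution is skewed and n = 50 is finite.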


Page 7: 17 Sampling Dist

Sampling distributions


Page 8: 17 Sampling Dist

Random experiment

“A random experiment is an experiment, trial, or observation that can be repeated numerous times under the same conditions... It must in no way be affected by any previous outcome and cannot be predicted with certainty.” (http://cnx.org/content/m13470/latest/)

i.e. it is uncertain (we don’t know ahead of time what the answer will be) and repeatable (ideally).


Page 9: 17 Sampling Dist

Where we are

Univariate random variables: an experiment with one output

Bivariate random variables: an experiment with two outputs

Sequences of random variables: an experiment performed repeatedly. Repeatable = i.i.d.


Page 10: 17 Sampling Dist

A sampling distribution: summary statistics from a repeated experiment


Page 11: 17 Sampling Dist

Definitions

Sample = results of n random experiments.

Random sample = results of a random experiment repeated n times. Therefore, they’re iid.

Both are sequences of random variables.

Statistic = A function of random variables with no unknown parameters.
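To make the last definition concrete, here is a small sketch contrasting quantities that are statistics with one that is not. The function names and the sample values are hypothetical, chosen only for illustration.

```python
from statistics import mean

def sample_mean(xs):
    """A statistic: computed from the observed sample alone."""
    return mean(xs)

def sample_range(xs):
    """Another statistic: largest minus smallest observation."""
    return max(xs) - min(xs)

def centred_mean(xs, mu):
    """Not a statistic when mu is an unknown parameter:
    it cannot be evaluated from the sample alone."""
    return mean(xs) - mu

sample = [3.1, 4.7, 2.2, 5.0]  # hypothetical observations
print(sample_mean(sample), sample_range(sample))
```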


Page 12: 17 Sampling Dist

Example

Spin a bottle and record the angle, in degrees, at which it points. Repeat.

How would you write this mathematically?
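One possible answer, assuming the bottle is fair so that every angle is equally likely and the spins do not influence one another (the uniform model is an assumption, not something stated on the slide):

```latex
\[
X_1, X_2, \ldots, X_n \overset{\text{iid}}{\sim} \mathrm{Uniform}(0, 360),
\qquad
\mathrm{E}[X_i] = 180,
\qquad
\mathrm{Var}(X_i) = \frac{360^2}{12} = 10800 .
\]
```

Under this model the standard deviation of a single spin is √10800 ≈ 103.9 degrees.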


Page 13: 17 Sampling Dist

First time

x1 = 205, x2 = 256, x3 = 86, x4 = 119, x5 = 16, x6 = 278, x7 = 55, x8 = 16, x9 = 295, x10 = 341, x11 = 299, x12 = 270, x13 = 118, x14 = 360, x15 = 97, x16 = 282, x17 = 42, x18 = 283, x19 = 259, x20 = 326


Page 14: 17 Sampling Dist

Second time

x1 = 184, x2 = 344, x3 = 118, x4 = 226, x5 = 208, x6 = 106, x7 = 332, x8 = 310, x9 = 339, x10 = 95, x11 = 7, x12 = 274, x13 = 120, x14 = 346, x15 = 211, x16 = 166, x17 = 84, x18 = 102, x19 = 32, x20 = 128
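As a quick illustration of how a summary statistic changes between repetitions, the sketch below computes the sample mean of each of the two runs listed on these slides. The variable names are hypothetical; the values are copied from the slides.

```python
from statistics import mean

# Angles (in degrees) from the two runs of 20 spins shown on the slides.
first = [205, 256, 86, 119, 16, 278, 55, 16, 295, 341,
         299, 270, 118, 360, 97, 282, 42, 283, 259, 326]
second = [184, 344, 118, 226, 208, 106, 332, 310, 339, 95,
          7, 274, 120, 346, 211, 166, 84, 102, 32, 128]

for label, xs in (("first", first), ("second", second)):
    print(f"{label} run: n = {len(xs)}, sample mean = {mean(xs):.2f}")
```

The two means are 200.15 and 186.60: the same experiment, repeated, gives a different value of the statistic each time.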


Page 15: 17 Sampling Dist

[Figure: dot plot of the recorded values (x-axis "Value", ticks from 50 to 350) for each of 20 repeated experiments (y-axis "Experiment", 1 to 20), with 20 points per experiment.]

Page 16: 17 Sampling Dist

[Figure: the same dot plot of "Value" against "Experiment" as on the previous slide.]

Page 17: 17 Sampling Dist

[Figure: histogram of "samp" (x-axis, about 140 to 200) against "count" (y-axis, 0 to 4).]

Page 18: 17 Sampling Dist

[Figure: histogram of "V1" (x-axis, about 100 to 250) against "count" (y-axis, 0 to 8000).]

Page 19: 17 Sampling Dist

[Figure: histogram of "V1" (x-axis, about 100 to 250) against "count" (y-axis, 0 to 8000), as on the previous slide.]

What will happen as I vary the number of samples I average over? (What theorem applies here?)
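One way to explore this question is by simulation. The sketch below assumes each spin is Uniform(0, 360), which is a modelling assumption rather than something stated on the slides, and reports the spread of the sample mean for a few sample sizes. The function name, the number of repetitions, and the seed are arbitrary choices.

```python
import random
from statistics import stdev

def simulate_means(n, reps=2000, seed=310):
    """Return `reps` sample means, each averaging n Uniform(0, 360) spins."""
    rng = random.Random(seed)
    return [sum(rng.uniform(0, 360) for _ in range(n)) / n
            for _ in range(reps)]

# Standard deviation of the sample mean for each sample size.
spread = {n: stdev(simulate_means(n)) for n in (1, 10, 100)}
for n, s in spread.items():
    print(f"n = {n:>3}: sd of sample mean = {s:6.1f}")
```

Under the uniform assumption the spread should shrink roughly like 103.9 / sqrt(n), so the distribution of the mean concentrates around 180 as n grows.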


Page 20: 17 Sampling Dist

[Figure: histograms of "mean" (x-axis, 0 to 350) against "count" (y-axis, 0 to 400), in panels labelled 1, 2, 3, 4 and 5.]

Page 21: 17 Sampling Dist

[Figure: histograms of "mean" (x-axis, 0 to 350) against "count" (y-axis, 0 to 4000), in panels labelled 1, 10, 100 and 1000.]

Page 22: 17 Sampling Dist

[Figure: histograms of "mean" (x-axis, 0 to 350) against "count" (y-axis, 0 to 4000), in panels labelled 1, 10, 100 and 1000, as on the previous slide.]

How can I transform this random variable to make it comparable? (What theorem applies here?)


Page 23: 17 Sampling Dist

[Figure: histograms of (mean − 180) * sqrt(n) (x-axis, −400 to 400) against "count" (y-axis, 0 to 800), in panels labelled 1, 2, 3, 4, 5, 10, 20, 100, 1000 and 10000.]
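The effect of this rescaling can be checked numerically. The sketch below again assumes Uniform(0, 360) spins (an assumption, with mean 180 and standard deviation 360 / sqrt(12)) and shows that (mean − 180) * sqrt(n) has roughly the same spread for every n, which is what makes the panels comparable. The names and simulation sizes are illustrative.

```python
import random
from statistics import stdev

SIGMA = 360 / 12 ** 0.5  # sd of one Uniform(0, 360) spin, about 103.9

def scaled_means(n, reps=2000, seed=310):
    """Simulate (mean - 180) * sqrt(n) for samples of n uniform spins."""
    rng = random.Random(seed)
    out = []
    for _ in range(reps):
        xbar = sum(rng.uniform(0, 360) for _ in range(n)) / n
        out.append((xbar - 180) * n ** 0.5)
    return out

spreads = {n: stdev(scaled_means(n)) for n in (1, 5, 20, 100)}
for n, s in spreads.items():
    print(f"n = {n:>3}: sd of scaled mean = {s:6.1f} (sigma = {SIGMA:.1f})")
```

Dividing additionally by σ would give the fully standardised quantity that appears in the usual statement of the central limit theorem.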


Page 24: 17 Sampling Dist

[Figure: histograms of sqrt(var) (x-axis, 0 to 250) against "count" (y-axis, 0 to 1000), in panels labelled 2, 3, 4 and 5.]

We can do the same thing for other statistics...
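For instance, the sampling distribution of the sample standard deviation can be simulated in the same way. This is a sketch assuming Uniform(0, 360) spins; the function name, the sample sizes, and the seed are arbitrary choices rather than the settings used for the slides.

```python
import random
from statistics import mean, stdev

def simulate_sds(n, reps=2000, seed=310):
    """Return `reps` sample standard deviations, each from n uniform spins."""
    rng = random.Random(seed)
    return [stdev([rng.uniform(0, 360) for _ in range(n)])
            for _ in range(reps)]

summary = {}
for n in (2, 5, 100):
    sds = simulate_sds(n)
    summary[n] = (mean(sds), stdev(sds))  # centre and spread of sqrt(var)
    print(f"n = {n:>3}: mean sd = {summary[n][0]:6.1f}, "
          f"spread = {summary[n][1]:5.1f}")
```

As n grows, the simulated values of sqrt(var) should concentrate near 360 / sqrt(12) ≈ 103.9, the standard deviation of a single spin under the uniform assumption.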


Page 25: 17 Sampling Dist

[Figure: histograms of sqrt(var) against "count", in panels labelled 2, 3, 4, 5, 10, 20, 100, 1000 and 10000, each panel with its own axis ranges. The x-axis range narrows from about 0 to 250 in panel 2, to about 90 to 120 in panel 100, to about 102.5 to 105.5 in panel 10000.]

Page 26: 17 Sampling Dist

Theory

We’ll start with the mean of normally distributed random variables, then try to extend in various ways.


Page 27: 17 Sampling Dist

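Your turn

X1, X2, ... are iid N(μ, σ²), with

$S_n = \sum_{i=1}^{n} X_i$ and $\bar{X}_n = S_n / n$.

Find the mgfs of $S_n$ and $\bar{X}_n$. What do you notice?

Hint: $M_X(t) = \exp\left(\mu t + \tfrac{1}{2}\sigma^2 t^2\right)$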
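A sketch of one way the exercise can be worked, using the standard normal mgf $M_X(t) = \exp(\mu t + \sigma^2 t^2 / 2)$, independence of the $X_i$, and the scaling rule $M_{aY}(t) = M_Y(at)$:

```latex
\begin{align*}
M_{S_n}(t) &= \mathrm{E}\!\left[e^{t(X_1 + \cdots + X_n)}\right]
            = \prod_{i=1}^{n} M_{X}(t)
            = \exp\!\left(n\mu t + \tfrac{1}{2}\, n\sigma^2 t^2\right), \\
M_{\bar{X}_n}(t) &= M_{S_n}(t/n)
            = \exp\!\left(\mu t + \tfrac{1}{2}\,\frac{\sigma^2}{n}\, t^2\right).
\end{align*}
```

Both are again normal mgfs, so $S_n \sim N(n\mu, n\sigma^2)$ and $\bar{X}_n \sim N(\mu, \sigma^2/n)$ exactly, for every n, not only in the limit.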


Page 28: 17 Sampling Dist

Reading

4.2, 4.2.1

4.2.2, 4.4
