Source: …bio300/notes/03Samples4x.pdf

Samples and populations

Estimating with uncertainty

Review - order of operations

$$s^2 = \frac{n}{n-1}\left(\frac{\sum_{i=1}^{n} Y_i^2}{n} - \bar{Y}^2\right)$$

Review - order of operations

1. Parentheses

2. Exponents and roots

3. Multiply and divide

4. Add and subtract
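As a minimal sketch of how the order of operations applies to the variance formula above, the Python below evaluates it step by step and compares the result with the standard library. The function name `sample_variance_shortcut` and the five data values are hypothetical and chosen only for illustration.

```python
import statistics


def sample_variance_shortcut(values):
    """Evaluate s^2 = n/(n-1) * (sum(Y_i^2)/n - Ybar^2) in order-of-operations steps."""
    n = len(values)
    mean = sum(values) / n
    # Inside the parentheses: exponents first, then the division by n.
    mean_of_squares = sum(y ** 2 for y in values) / n
    # Still inside the parentheses: square the mean, then subtract.
    inner = mean_of_squares - mean ** 2
    # Last step: multiply by the n / (n - 1) factor outside the parentheses.
    return n / (n - 1) * inner


data = [2, 4, 4, 5, 7]  # hypothetical measurements, not from the slides
s2 = sample_variance_shortcut(data)
print(round(s2, 4), round(statistics.variance(data), 4))
```

Both numbers should agree, because `statistics.variance` also uses the n - 1 denominator.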

Review - types of variables

• Categorical variables

– For example, country of birth

• Numerical variables

– For example, student height

Review - types of variables

• Categorical variables

– Nominal

– Ordinal

• Numerical variables

– Discrete

– Continuous

Review - types of variables

• Categorical variables

– Nominal - no natural order

• Example - country of birth

– Ordinal - can be placed in an order

• Example - educational experience

– Some high school, high school diploma, some college, college degree, master's degree, PhD

Sampling from a population

• We often sample from a population

• Consider random samples

– Each individual has an equal and independent probability of being selected
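One way to draw such a sample in code, as a rough sketch: the population below is invented (400 simulated body masses, not the data behind the slides), and the seed is fixed only so the run is repeatable.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 400 body masses in kg, invented for illustration.
population = [rng.gauss(70, 12) for _ in range(400)]

# Draw 10 individuals without replacement; each one is equally likely to be chosen.
sample = rng.sample(population, 10)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)
print(f"population mean = {population_mean:.1f} kg")
print(f"sample mean     = {sample_mean:.1f} kg")
```

Changing the seed gives a different sample and a different sample mean, while the population mean stays fixed.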

Body mass of 400 humans

Random sample of 10 people

Population mean: µ = 70.8 kg

Sample mean: x̄ = 76.7 kg

Another sample…

Population mean: µ = 70.8 kg

Sample mean: x̄ = 69.2 kg

What if we do this many times?

Example: gene length

n = 20,290

n = 20,290; µ = 2622.0; σ = 2037.9

Sample histogram

n = 100

Ȳ = 2675.4, s = 1539.2

Ȳ = 2588.8, s = 1620.5

Ȳ = 2702.4, s = 1727.1

Ȳ = 2767.2, s = 2044.7

Sampling distribution of the mean

1000 samples

µ = 2622.0

Mean of means: 2626.4

Sampling distribution of the standard deviation

s = 1539.2

s = 1620.5

s = 1727.1

s = 2044.7

100 samples

Population σ = 2036.9

Mean sample s = 1962.6

1000 samples

Population σ = 2036.9

Mean sample s = 1929.7

Sampling distribution of the mean, n = 10

Sampling distribution of the mean, n = 100

Sampling distribution of the mean, n = 1000
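The effect of sample size on the sampling distribution of the mean can be explored with a small simulation. This is only a sketch: the population is a hypothetical right-skewed set of values standing in for gene lengths (the real data are not reproduced here), and the helper `sample_means` is an illustrative name.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical right-skewed population standing in for gene lengths.
population = [rng.expovariate(1 / 2600) for _ in range(20000)]
mu = sum(population) / len(population)


def sample_means(n, repeats=1000):
    """Take `repeats` random samples of size n and return their means."""
    return [sum(rng.sample(population, n)) / n for _ in range(repeats)]


spread = {}
for n in (10, 100, 1000):
    means = sample_means(n)
    spread[n] = statistics.stdev(means)  # spread of the sampling distribution
    print(f"n = {n:4d}: mean of means = {statistics.mean(means):7.1f}, "
          f"SD of means = {spread[n]:6.1f}")
print(f"population mean = {mu:.1f}")
```

In each case the mean of the sample means should land close to the population mean, while the spread of the sample means shrinks as n grows.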

[Figure: estimates compared as precise vs. imprecise and biased vs. unbiased; annotation: "Larger sample size"]

Group activity #2

• Form groups of size 2-5

• Get out a blank sheet of paper

• Write everyone’s full name on the paper

How many toes do aliens have?

Instructions

• You have measurements from a population of 400 aliens

• Use your random number table to select a sample of ten measurements

• Calculate your sample mean and, if you have a calculator or a large brain, your sample standard deviation

• On your paper, answer the following:

1. What was your sample mean and standard deviation?

2. How did you randomly choose your sample?

Distribution of the sample mean

• No matter what the frequency distribution of the population:

• The sample mean has an approximately bell-shaped (normal) distribution

• Especially for large n (large samples)

How precise is any one estimated sample mean?

The standard error of an estimate is the standard deviation of its sampling distribution. The standard error predicts the sampling error of the estimate.

Standard error of the mean

$$\sigma_{\bar{Y}} = \frac{\sigma}{\sqrt{n}}$$

Estimate of the standard error of the mean

$$SE_{\bar{Y}} = \frac{s}{\sqrt{n}}$$
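A minimal sketch of this estimate computed from raw data, assuming a small hypothetical sample (the five values and the function name `standard_error_of_mean` are not from the slides):

```python
import math
import statistics


def standard_error_of_mean(values):
    """Estimate the standard error of the mean as s / sqrt(n)."""
    s = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return s / math.sqrt(len(values))


sample = [2, 4, 4, 5, 7]  # hypothetical measurements
se = standard_error_of_mean(sample)
print(f"mean = {statistics.mean(sample):.2f}, SE = {se:.3f}")
```

Because n sits under a square root, quadrupling the sample size only halves the standard error, all else being equal.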

Confidence interval

• Confidence interval

– a range of values surrounding the sample estimate that is likely to contain the population parameter

• 95% confidence interval

– plausible range for a parameter based on the data

The 2SE rule-of-thumb

The interval from $\bar{Y} - 2\,SE_{\bar{Y}}$ to $\bar{Y} + 2\,SE_{\bar{Y}}$ provides a rough estimate of the 95% confidence interval for the mean.
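As an illustration, the rule of thumb can be applied to the four gene-length samples listed earlier. This sketch assumes each of those samples has n = 100, as the sample histogram slide suggests; the function name `two_se_interval` is illustrative.

```python
import math

MU = 2622.0  # population mean reported on the slides
N = 100      # assumed size of each sample

# (sample mean, sample standard deviation) for the four samples on the slides
samples = [(2675.4, 1539.2), (2588.8, 1620.5), (2702.4, 1727.1), (2767.2, 2044.7)]


def two_se_interval(mean, s, n):
    """Return the rough 95% interval: mean - 2 SE to mean + 2 SE."""
    se = s / math.sqrt(n)
    return mean - 2 * se, mean + 2 * se


intervals = [two_se_interval(mean, s, N) for mean, s in samples]
for lower, upper in intervals:
    covers = lower <= MU <= upper
    print(f"{lower:.1f} to {upper:.1f}  contains mu: {covers}")
```

Under that assumption, each of the four rough intervals happens to contain the population mean; over many samples, roughly 1 in 20 such intervals would be expected to miss it.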

Pseudoreplication

The error that occurs when samples are not independent, but they are treated as though they are.

Example: “The Transylvania effect”

A study of 130,000 calls for police assistance in 1980 found that they were more likely than chance to occur during a full moon.

Problem: There may have been 130,000 calls in the data set, but there were only 13 full moons in 1980. These data are not independent.