Estimates and sample sizes
Chapter 6

Prof. Felix Apfaltrer
fapfaltrer@bmcc.cuny.edu
Office: N763
Phone: 212-220 74 21
Office hours: Tue, Thu 10am-11:30 am

Inferential Statistics

Statistical Methods
• Descriptive Statistics
• Inferential Statistics
  – Estimation
  – Hypothesis Testing

Inferential Statistics

• 1. Involves:
  – Estimation
  – Hypothesis Testing

• 2. Purpose:
  – Make decisions about population characteristics

Inference Process

[Figure: Population → Sample → Sample statistic ($\bar{X}$) → Estimates & tests]

Point Estimates

Estimation
• Point Estimate
• Confidence Interval

Point Estimates 2

Estimate population parameter...      with sample statistic
Mean           $\mu$                  $\bar{x}$
Proportion     $p$                    $\hat{p}$
Variance       $\sigma^2$             $s^2$
Differences    $\mu_1 - \mu_2$        $\bar{x}_1 - \bar{x}_2$


Samples and estimation

Assumptions:

• Samples should be simple random samples

• The requirements for a binomial distribution are satisfied

• The normal distribution can be used to approximate the binomial distribution

Example (survey about photo-cop):

• 829 Minnesotans surveyed

• 51% opposed to cameras used for issuing traffic tickets

• Point estimate $\hat{p} = 0.51$

Notation:
$p$ = population proportion
$\hat{p} = x/n$ = sample proportion of $x$ successes in a sample of size $n$
$\hat{q} = 1 - \hat{p}$ = sample proportion of failures in a sample of size $n$

Definition:

A point estimate is a single value used to estimate a population parameter.

Example Point Estimate

A study measured the body temperatures of 106 healthy adults. The sample mean was 98.2 degrees and the sample standard deviation was 0.62 degrees. Find the point estimate of the population mean of all body temperatures.

Because the sample mean $\bar{x}$ is the best point estimate of the population mean $\mu$, we conclude that the best point estimate of the population mean of all body temperatures is 98.20°F.

Confidence Intervals

Estimation
• Point Estimate
• Confidence Interval

Confidence Interval Definition

[Figure: a confidence interval centered on the sample statistic (point estimate), bounded by a lower confidence limit and an upper confidence limit.]

Definition: A confidence interval is a range (or an interval) of values used to estimate the true value of the population parameter. The confidence level gives us the success rate of the procedure used to construct the confidence interval.


Confidence intervals (cont)

Definition:

A confidence interval (CI) is a range of values used to estimate the true value of a population parameter.

Example (survey about photo-cop):
• 829 Minnesotans surveyed
• 51% opposed to cameras used for issuing traffic tickets
• The 0.95 confidence interval estimate of the population proportion $p$ opposed to photo-cop is $0.476 < p < 0.544$

Confidence level associated with a confidence interval:
• the success rate of the procedure, that is, the proportion of intervals built this way that contain the population parameter
• given as the probability $1 - \alpha$
• For example: confidence level of 0.95, $\alpha = 0.05$; confidence level of 0.99, $\alpha = 0.01$
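The slide quotes the limits 0.476 and 0.544 without showing the calculation. The following Python sketch assumes the standard normal-approximation interval $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$, which is not stated on this slide, and uses the rounded value $\hat{p} = 0.51$. The function name `proportion_interval` is illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(p_hat, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # critical value z_{alpha/2}
    q_hat = 1 - p_hat                         # sample proportion of failures
    margin = z * sqrt(p_hat * q_hat / n)      # margin of error (assumed formula)
    return p_hat - margin, p_hat + margin


# Photo-cop survey: n = 829, p_hat = 0.51 (rounded, as given on the slide)
lower, upper = proportion_interval(0.51, 829)
print(f"{lower:.3f} < p < {upper:.3f}")
```

Rounded to three decimal places, the result agrees with the limits quoted in the example.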

Confidence intervals (cont)

Definition: The critical values are the numbers $-z_{\alpha/2}$ and $z_{\alpha/2}$ that separate the areas $\alpha/2$ in the left and right tails from the center area $1 - \alpha$.

Example (survey about photo-cop):
• For a confidence level of 0.95, $\alpha = 0.05$, so $\alpha/2 = 0.025$
• In Table A-2, $-z_{\alpha/2} = -1.96$ and $z_{\alpha/2} = 1.96$

short visit from Navy.

• Calculate P(X ≥ 308 days).
• What does this suggest?
• Premature if below 4%. Find length.

Q11 SAT: $\mu = 998$, $\sigma = 202$

College requires 1100 minimum.

• Find the percentage satisfying the requirement.
• Find the 40th percentile. Why does the college not ask for the top 40%?

[Figure: standard normal curve with central area $1 - \alpha$ and an area of $\alpha/2$ in each tail.]

Confidence level    $\alpha$    $z_{\alpha/2}$
90%                 0.10        1.645
95%                 0.05        1.96
99%                 0.01        2.575
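The critical values in the table can be checked numerically instead of being read from Table A-2. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is an illustrative choice.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Return z_{alpha/2} for a two-sided confidence level such as 0.95."""
    alpha = 1 - confidence
    # The area to the left of z_{alpha/2} is 1 - alpha/2.
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}  alpha = {1 - level:.2f}  z = {z_critical(level):.3f}")
```

The computed 99% value is about 2.576. The table's 2.575 appears to be a table-lookup convention, so small differences in the third decimal are expected.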

Level of Confidence

The confidence level is often expressed as the probability $1 - \alpha$, where $\alpha$ is the complement of the confidence level. For a 0.95 (95%) confidence level, $\alpha = 0.05$. For a 0.99 (99%) confidence level, $\alpha = 0.01$.

The margin of error is the maximum likely difference observed between the sample mean $\bar{x}$ and the population mean $\mu$, and is denoted by $E$.

$E = z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}}$

Confidence Interval (or Interval Estimate) for Population Mean $\mu$ when $\sigma$ is known

$\bar{x} - E < \mu < \bar{x} + E$

$(\bar{x} - E, \ \bar{x} + E)$

$\bar{x} \pm E$

Procedure for Constructing a Confidence Interval for $\mu$ when $\sigma$ is known

1. Verify that the required assumptions are met.

2. Find the critical value $z_{\alpha/2}$ that corresponds to the desired degree of confidence.

3. Evaluate the margin of error $E = z_{\alpha/2} \cdot \sigma / \sqrt{n}$.

4. Find the values of $\bar{x} - E$ and $\bar{x} + E$. Substitute those values in the general format of the confidence interval:

$\bar{x} - E < \mu < \bar{x} + E$
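Steps 2 to 4 of the procedure can be sketched in a few lines of Python. The function name `z_interval` is hypothetical. The numbers are the body temperature summary from the earlier point-estimate example, and treating the sample value s = 0.62 as $\sigma$ is an assumption made only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for mu when sigma is known (steps 2 to 4)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # step 2: critical value
    margin = z * sigma / sqrt(n)                        # step 3: margin of error
    return margin, (x_bar - margin, x_bar + margin)     # step 4: interval limits


# Body temperatures: n = 106, x_bar = 98.20, s = 0.62 used in place of sigma
E, (low, high) = z_interval(98.20, 0.62, 106)
print(f"E = {E:.2f}, {low:.2f} < mu < {high:.2f}")
```

Step 1, checking the assumptions, is not automated here. It depends on how the sample was collected.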


Round-Off Rule for Confidence Intervals Used to Estimate µ

1. When using the original set of data, round the confidence interval limits to one more decimal place than used in original set of data.

2. When the original set of data is unknown and only the summary statistics ($n$, $\bar{x}$, $s$) are used, round the confidence interval limits to the same number of decimal places used for the sample mean.

Example: A study found the body temperatures of 106 healthy adults. The sample mean was 98.2 degrees and the sample standard deviation was 0.62 degrees. Find the margin of error $E$ and the 95% confidence interval for $\mu$.

$n = 106$
$\bar{x} = 98.20°$
$s = 0.62°$
$\alpha = 0.05$, so $\alpha/2 = 0.025$
$z_{\alpha/2} = 1.96$

$E = z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}} = 1.96 \cdot \dfrac{0.62}{\sqrt{106}} = 0.12$ (with $s = 0.62$ used in place of $\sigma$)

$\bar{x} - E < \mu < \bar{x} + E$

$98.20° - 0.12 < \mu < 98.20° + 0.12$

$98.08° < \mu < 98.32°$

Sample Size for Estimating Mean

$n = \left[ \dfrac{z_{\alpha/2} \, \sigma}{E} \right]^2$

When finding the sample size $n$, if the formula above does not result in a whole number, always increase the value of $n$ to the next larger whole number.

If $\sigma$ is unknown:
a) use the estimate $\sigma \approx$ range/4, or
b) calculate the sample standard deviation $s$ and use it in place of $\sigma$, or
c) estimate the value of $\sigma$ by using the results of some other study that was done earlier.

Example: Assume that we want to estimate the mean IQ score for the population of statistics professors. How many statistics professors must be randomly selected for IQ tests if we want 95% confidence that the sample mean is within 2 IQ points of the population mean? Assume that $\sigma = 15$, as is found in the general population.

$\alpha = 0.05$, so $\alpha/2 = 0.025$
$z_{\alpha/2} = 1.96$
$E = 2$
$\sigma = 15$

$n = \left[ \dfrac{1.96 \cdot 15}{2} \right]^2 = 216.09$, which is rounded up to $n = 217$

With a simple random sample of only 217 statistics professors, we will be 95% confident that the sample mean will be within 2 points of the true population mean $\mu$.
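The IQ example can be reproduced with a short Python sketch of the sample size formula $n = [z_{\alpha/2}\,\sigma / E]^2$, rounding up to the next whole number. The function name `sample_size_for_mean` is an illustrative choice, not something from the slides.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Return the unrounded and rounded-up sample size for estimating a mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
    raw = (z * sigma / margin) ** 2                      # unrounded result
    return raw, ceil(raw)                                # always round up


# IQ example: sigma = 15, desired margin of error E = 2, 95% confidence
raw, n = sample_size_for_mean(sigma=15, margin=2)
print(f"unrounded n = {raw:.2f}, required n = {n}")
```

The unrounded value differs slightly from 216.09 because the code uses an unrounded critical value rather than 1.96. The required sample size is 217 either way.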

Many Samples Have Same Interval

$\bar{X} = \mu \pm Z\sigma_{\bar{x}}$

[Figure: sampling distribution of $\bar{X}$ centered at $\mu$, with three nested ranges.]
• 90% of samples: $\mu - 1.65\sigma_{\bar{x}}$ to $\mu + 1.65\sigma_{\bar{x}}$
• 95% of samples: $\mu - 1.96\sigma_{\bar{x}}$ to $\mu + 1.96\sigma_{\bar{x}}$
• 99% of samples: $\mu - 2.58\sigma_{\bar{x}}$ to $\mu + 2.58\sigma_{\bar{x}}$

$\sigma$ Not Known

Assumptions
1) The sample is a simple random sample.
2) Either the sample is from a normally distributed population, or $n > 30$.

Use the Student t distribution.

If the distribution of a population is essentially normal, then the distribution of

$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$

is essentially a Student t distribution for all samples of size $n$, and is used to find critical values denoted by $t_{\alpha/2}$.

Degrees of freedom (df) corresponds to the number of sample values that can vary after certain restrictions have been imposed on all data values. In this section, df = $n - 1$.

Confidence Interval for the Estimate of $\mu$
Based on an Unknown $\sigma$ and a Small Simple Random Sample from a Normally Distributed Population

$\bar{x} - E < \mu < \bar{x} + E$, where $E = t_{\alpha/2} \cdot \dfrac{s}{\sqrt{n}}$

$E$ is the margin of error, and $t_{\alpha/2}$, found in Table A-3, has $n - 1$ degrees of freedom.

Example: A study found the body temperatures of 106 healthy adults. The sample mean was 98.2 degrees and the sample standard deviation was 0.62 degrees. Find the margin of error $E$ and the 95% confidence interval for $\mu$.

$n = 106$
$\bar{x} = 98.20°$
$s = 0.62°$
$\alpha = 0.05$, so $\alpha/2 = 0.025$
$t_{\alpha/2} = 1.984$ (Table A-3, nearest entry for df = $n - 1$ = 105)

$E = t_{\alpha/2} \cdot \dfrac{s}{\sqrt{n}} = 1.984 \cdot \dfrac{0.62}{\sqrt{106}} = 0.1195$

$\bar{x} - E < \mu < \bar{x} + E$

$98.20° - 0.1195 < \mu < 98.20° + 0.1195$

$98.08° < \mu < 98.32°$

Based on the sample provided, the confidence interval for the population mean is $98.08° < \mu < 98.32°$. The interval is the same here as in the earlier example that used $z_{\alpha/2}$, but in some other cases the difference would be much greater.


Important Properties of the Student t Distribution

1. The Student t distribution is different for different sample sizes (see Figure 6-5 for the cases n = 3 and n = 12).

2. The Student t distribution has the same general symmetric bell shape as the normal distribution but it reflects the greater variability (with wider distributions) that is expected with small samples.

3. The Student t distribution has a mean of t = 0 (just as the standard normal distribution has a mean of z = 0).

4. The standard deviation of the Student t distribution varies with the sample size and is greater than 1 (unlike the standard normal distribution, which has $\sigma = 1$).

5. As the sample size n gets larger, the Student t distribution gets closer to the normal distribution.


Using the z Normal and t Distribution

Example: Data Set 14 in Appendix B includes the Flesch ease of reading scores for 12 different pages randomly selected from J.K. Rowling’s Harry Potter and the Sorcerer’s Stone. Find the 95% interval estimate of $\mu$, the mean Flesch ease of reading score. (The 12 pages’ distribution appears to be bell-shaped.)

$n = 12$
$\bar{x} = 80.75$
$s = 4.68$
$\alpha = 0.05$, so $\alpha/2 = 0.025$
$t_{\alpha/2} = 2.201$

$E = t_{\alpha/2} \cdot \dfrac{s}{\sqrt{n}} = \dfrac{(2.201)(4.68)}{\sqrt{12}} = 2.97355$

$\bar{x} - E < \mu < \bar{x} + E$

$80.75 - 2.97355 < \mu < 80.75 + 2.97355$

$77.77645 < \mu < 83.72355$

$77.78 < \mu < 83.72$
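The reading-score interval can be checked without Table A-3. Python's standard library has no Student t quantile function, so the sketch below approximates one by integrating the t density with Simpson's rule and then bisecting. The helper names `t_pdf`, `t_cdf`, and `t_critical` are hypothetical, and the numerical scheme is a swapped-in technique chosen for self-containment, not the table lookup used in the slides.

```python
from math import exp, lgamma, pi, sqrt


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2)
    return exp(log_c) / sqrt(df * pi) * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, using Simpson's rule on [0, x]."""
    if x == 0:
        return 0.5
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(confidence, df):
    """Find t_{alpha/2} by bisection on the cumulative distribution."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 50.0  # assumed bracket; wide enough for the cases used here
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Flesch reading ease example: n = 12, x_bar = 80.75, s = 4.68
n, x_bar, s = 12, 80.75, 4.68
t = t_critical(0.95, df=n - 1)
E = t * s / sqrt(n)
print(f"t = {t:.3f}, E = {E:.2f}, {x_bar - E:.2f} < mu < {x_bar + E:.2f}")
```

If SciPy is available, `scipy.stats.t.ppf(0.975, df)` returns the same critical value directly and would replace the three helpers.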

Finding the Point Estimate and E from a Confidence Interval

Margin of error:

$E = \dfrac{(\text{upper confidence limit}) - (\text{lower confidence limit})}{2}$

Point estimate of $\mu$:

$\bar{x} = \dfrac{(\text{upper confidence limit}) + (\text{lower confidence limit})}{2}$
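As a quick check of the two formulas, the sketch below recovers the point estimate and margin of error from the rounded limits of the Flesch reading ease interval, $77.78 < \mu < 83.72$. The function name `from_interval` is illustrative.

```python
def from_interval(lower, upper):
    """Recover the point estimate and margin of error from interval limits."""
    point_estimate = (upper + lower) / 2   # midpoint of the interval
    margin = (upper - lower) / 2           # half the interval width
    return point_estimate, margin


# Limits taken from the Flesch reading ease example: 77.78 < mu < 83.72
x_bar, E = from_interval(77.78, 83.72)
print(f"x_bar = {x_bar:.2f}, E = {E:.2f}")
```

The recovered values, 80.75 and 2.97, agree with the sample mean and the rounded margin of error in that example.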