MM 207 Unit #5 The Normal Distribution Slide 1.1- 1 Copyright © 2009 Pearson Education, Inc.


MM 207

Unit #5

The Normal Distribution

Slide 1.1- 1Copyright © 2009 Pearson Education, Inc.


WHAT IS NORMAL?

Section 5.1


Suppose a friend is pregnant and due to give birth on June 30. Would you advise her to schedule an important business meeting for June 16, two weeks before the due date?

Figure 5.1

Figure 5.1 is a histogram for a distribution of 300 natural births. The left vertical axis shows the number of births for each 4-day bin. The right vertical axis shows relative frequencies.


Figure 5.1

We can find the proportion of births that occurred more than 14 days before the due date by adding the relative frequencies for the bins to the left of -14.

These bins have a total relative frequency of about 0.21, which says that about 21% of the births in this data set occurred more than 14 days before the due date.


The Normal Shape

The distribution of the birth data has a fairly distinctive shape, which is easier to see if we overlay the histogram with a smooth curve (Figure 5.2).


For our present purposes, the shape of this smooth distribution has three very important characteristics:

• The distribution is single-peaked. Its mode, or most common birth date, is the due date.

• The distribution is symmetric around its single peak; therefore, its median and mean are the same as its mode. The median is the due date because equal numbers of births occur before and after this date. The mean is also the due date because, for every birth before the due date, there is a birth the same number of days after the due date.

• The distribution is spread out in a way that makes it resemble the shape of a bell, so we call it a “bell-shaped” distribution.


Figure 5.3 Both distributions are normal and have the same mean of 75, but the distribution on the left has a larger standard deviation.


Definition

The normal distribution is a symmetric, bell-shaped distribution with a single peak. Its peak corresponds to the mean, median, and mode of the distribution. Its variation can be characterized by the standard deviation of the distribution.


The Normal Distribution and Relative Frequencies

Relative Frequencies and the Normal Distribution

• The area that lies under the normal distribution curve corresponding to a range of values on the horizontal axis is the relative frequency of those values.

• Because the total relative frequency must be 1, the total area under the normal distribution curve must equal 1, or 100%.


Figure 5.5 The percentage of the total area in any region under the normal curve tells us the relative frequency of data values in that region.


EXAMPLE 2 Estimating Areas

Look again at the normal distribution in Figure 5.5.

a. Estimate the percentage of births occurring between 0 and 60 days after the due date.

Solution: a. About half of the total area under the curve lies in the region between 0 days and 60 days. This means that about 50% of the births in the sample occur between 0 and 60 days after the due date.


EXAMPLE 2 Estimating Areas

Look again at the normal distribution in Figure 5.5.

b. Estimate the percentage of births occurring between 14 days before and 14 days after the due date.

Solution: b. Figure 5.5 shows that about 18% of the births occur more than 14 days before the due date. Because the distribution is symmetric, about 18% must also occur more than 14 days after the due date. Therefore, a total of about 18% + 18% = 36% of births occur either more than 14 days before or more than 14 days after the due date. The question asked about the remaining region, between 14 days before and 14 days after the due date, so this region must represent 100% - 36% = 64% of the births.
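The area estimates in Example 2 can be checked numerically. The sketch below is only an illustration: it assumes the births follow a normal distribution centered on the due date with a standard deviation of 15 days (the value quoted in the "Toward Probability" slide), and the helper name `relative_frequency` is made up for this example.

```python
from statistics import NormalDist

# Assumption: births are normal around the due date (day 0) with a
# standard deviation of 15 days, as quoted in "Toward Probability".
births = NormalDist(mu=0.0, sigma=15.0)


def relative_frequency(dist: NormalDist, low: float, high: float) -> float:
    """Area under the normal curve between low and high."""
    return dist.cdf(high) - dist.cdf(low)


# Part (b): births between 14 days before and 14 days after the due date.
within_two_weeks = relative_frequency(births, -14, 14)

# The left tail: births more than 14 days before the due date.
more_than_two_weeks_early = births.cdf(-14)

print(f"between -14 and +14 days: {within_two_weeks:.1%}")
print(f"more than 14 days early:  {more_than_two_weeks_early:.1%}")
```

Under that assumption the computed areas land close to the 64% and 18% read off Figure 5.5 by eye.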


When Can We Expect a Normal Distribution?

Conditions for a Normal Distribution

A data set that satisfies the following four criteria is likely to have a nearly normal distribution:

1. Most data values are clustered near the mean, giving the distribution a well-defined single peak.

2. Data values are spread evenly around the mean, making the distribution symmetric.

3. Larger deviations from the mean become increasingly rare, producing the tapering tails of the distribution.

4. Individual data values result from a combination of many different factors, such as genetic and environmental factors.


EXAMPLE 3 Is It a Normal Distribution?

Which of the following variables would you expect to have a normal or nearly normal distribution?

a. Scores on a very easy test

Solution: a. Tests have a maximum possible score (100%) that limits the size of data values. If the test is easy, the mean will be high and many scores will be close to the maximum possible. The few lower scores may be spread out well below the mean. We therefore expect the distribution of scores to be left-skewed and non-normal.


EXAMPLE 3 Is It a Normal Distribution?

Which of the following variables would you expect to have a normal or nearly normal distribution?

b. Heights of a random sample of adult women

Solution: b. Height is determined by a combination of many factors (the genetic makeup of both parents and possibly environmental or nutritional factors). We expect the mean height for the sample to be close to the mode (most common height). We also expect there to be roughly equal numbers of women above and below the mean, and extremely large and small heights should be rare. That is why height is nearly normally distributed.


PROPERTIES OF THE NORMAL DISTRIBUTION

Section 5.2


Consider a Consumer Reports survey in which participants were asked how long they owned their last TV set before they replaced it. The variable of interest in this survey is replacement time for television sets.

Based on the survey, the distribution of replacement times has a mean of about 8.2 years, which we denote as μ (the Greek letter mu).

The standard deviation of the distribution is about 1.1 years, which we denote as σ (the Greek letter sigma).


Making the reasonable assumption that the distribution of TV replacement times is approximately normal, we can picture it as shown in Figure 5.16.

Figure 5.16 Normal distribution for replacement times for TV sets with a mean of μ = 8.2 years and a standard deviation of σ = 1.1 years.


A simple rule, called the 68-95-99.7 rule, gives precise guidelines for the percentage of data values that lie within 1, 2, and 3 standard deviations of the mean for any normal distribution.

Figure 5.17 Normal distribution illustrating the 68-95-99.7 rule.


The 68-95-99.7 Rule for a Normal Distribution

• About 68% (more precisely, 68.3%), or just over two-thirds, of the data points fall within 1 standard deviation of the mean.

• About 95% (more precisely, 95.4%) of the data points fall within 2 standard deviations of the mean.

• About 99.7% of the data points fall within 3 standard deviations of the mean.
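The three percentages in the rule can be reproduced from the standard normal curve. This is a small sketch using Python's standard-library `statistics.NormalDist`; the function name `fraction_within` is chosen here for illustration and is not part of the course material.

```python
from statistics import NormalDist


def fraction_within(k: float) -> float:
    """Fraction of a normal distribution within k standard deviations of the mean."""
    standard_normal = NormalDist(mu=0.0, sigma=1.0)
    # Area between -k and +k under the standard normal curve.
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {fraction_within(k):.2%}")
```

Because every normal distribution has the same shape once it is measured in standard deviations, the same three fractions apply whatever the mean and standard deviation happen to be.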


EXAMPLE 1 SAT Scores

The tests that make up the verbal (critical reading) and mathematics SAT (and the GRE, LSAT, and GMAT) are designed so that their scores are normally distributed with a mean of μ = 500 and a standard deviation of σ = 100. Interpret this statement.

Solution: From the 68-95-99.7 rule, about 68% of students have scores within 1 standard deviation (100 points) of the mean of 500 points; that is, about 68% of students score between 400 and 600. About 95% of students score within 2 standard deviations (200 points) of the mean, or between 300 and 700. And about 99.7% of students score within 3 standard deviations (300 points) of the mean, or between 200 and 800.


EXAMPLE 1 SAT Scores

Solution: (cont.) Figure 5.18 shows this interpretation graphically; note that the horizontal axis shows both actual scores and distance from the mean in standard deviations.

Figure 5.18 Normal distribution for SAT scores, showing the percentages associated with 1, 2, and 3 standard deviations.


EXAMPLE 2 Detecting Counterfeits

Vending machines can be adjusted to reject coins above and below certain weights. The weights of legal U.S. quarters have a normal distribution with a mean of 5.67 grams and a standard deviation of 0.0700 gram. If a vending machine is adjusted to reject quarters that weigh more than 5.81 grams or less than 5.53 grams, what percentage of legal quarters will be rejected by the machine?

Solution: A weight of 5.81 is 0.14 gram, or 2 standard deviations, above the mean. A weight of 5.53 is 0.14 gram, or 2 standard deviations, below the mean. Therefore, by accepting only quarters within the weight range 5.53 to 5.81 grams, the machine accepts quarters that are within 2 standard deviations of the mean and rejects those that are more than 2 standard deviations from the mean. By the 68-95-99.7 rule, about 95% of legal quarters will be accepted and about 5% of legal quarters will be rejected.
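As a cross-check on the rule-of-thumb answer, the rejected fraction can be computed directly from the normal curve. The sketch below assumes the stated mean and standard deviation describe the quarters exactly; `rejected_fraction` is a hypothetical helper name.

```python
from statistics import NormalDist

# Weights of legal quarters, in grams, as given in the example.
quarters = NormalDist(mu=5.67, sigma=0.07)


def rejected_fraction(dist: NormalDist, low: float, high: float) -> float:
    """Fraction of values below low or above high (the two tails)."""
    return dist.cdf(low) + (1.0 - dist.cdf(high))


rejected = rejected_fraction(quarters, 5.53, 5.81)
print(f"legal quarters rejected: {rejected:.2%}")
```

The computed value comes out a little under 5%, which matches the more precise 95.4% figure for values within 2 standard deviations of the mean.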


Applying the 68-95-99.7 Rule

We can apply the 68-95-99.7 rule to determine when data values lie 1, 2, or 3 standard deviations from the mean.

For example, suppose that 1,000 students take an exam and the scores are normally distributed with a mean of μ = 75 and a standard deviation of σ = 7.


Figure 5.19 A normal distribution of test scores with a mean of 75 and a standard deviation of 7. (a) 68% of the scores lie within 1 standard deviation of the mean. (b) 95% of the scores lie within 2 standard deviations of the mean.


Identifying Unusual Results

In statistics, we often need to distinguish values that are typical, or “usual,” from values that are “unusual.” By applying the 68-95-99.7 rule, we find that about 95% of all values from a normal distribution lie within 2 standard deviations of the mean.

This implies that, among all values, 5% lie more than 2 standard deviations away from the mean. We can use this property to identify values that are relatively “unusual”:

Unusual values are values that are more than 2 standard deviations away from the mean.


EXAMPLE 4 Normal Heart Rate

You measure your resting heart rate at noon every day for a year and record the data. You discover that the data have a normal distribution with a mean of 66 and a standard deviation of 4. On how many days was your heart rate below 58 beats per minute?

Solution: A heart rate of 58 is 8 (or 2 standard deviations) below the mean. According to the 68-95-99.7 rule, about 95% of the data points are within 2 standard deviations of the mean. The remaining 5% is split evenly between the two tails because the distribution is symmetric. Therefore, about 2.5% of the data points are more than 2 standard deviations below the mean, and about 2.5% of the data points are more than 2 standard deviations above the mean. On 2.5% of 365 days, or about 9 days, your measured heart rate was below 58 beats per minute.
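The same count can be estimated two ways: with the rounded 2.5% from the 68-95-99.7 rule, and with the tail area computed from the normal curve. This is a sketch only; the variable names are illustrative.

```python
from statistics import NormalDist

# Resting heart rate in beats per minute, as given in the example.
heart_rate = NormalDist(mu=66, sigma=4)
days_in_year = 365

# Rule-of-thumb estimate: 2.5% of days fall more than 2 standard deviations below the mean.
days_by_rule = 0.025 * days_in_year

# Estimate from the area under the curve to the left of 58.
days_by_curve = heart_rate.cdf(58) * days_in_year

print(f"68-95-99.7 rule: about {days_by_rule:.1f} days")
print(f"normal curve:    about {days_by_curve:.1f} days")
```

The two estimates differ by less than a day; the gap comes from rounding 95.4% down to 95% in the rule.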


EXAMPLE 5 Finding a Percentile

On a visit to the doctor’s office, your fourth-grade daughter is told that her height is 1 standard deviation above the mean for her age and sex. What is her percentile for height? Assume that heights of fourth-grade girls are normally distributed.

Solution: Recall that a data value lies in the nth percentile of a distribution if n% of the data values are less than or equal to it (see Section 4.3). According to the 68-95-99.7 rule, 68% of the heights are within 1 standard deviation of the mean. Therefore, 34% of the heights (half of 68%) are between 0 and 1 standard deviation above the mean. We also know that, because the distribution is symmetric, 50% of all heights are below the mean. Therefore, 50% + 34% = 84% of all heights are less than 1 standard deviation above the mean (Figure 5.21). Your daughter is in the 84th percentile for heights among fourth-grade girls.
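The step from "1 standard deviation above the mean" to a percentile is just the area under the standard normal curve to the left of that point. A minimal sketch, with `percentile_from_z` as an illustrative name:

```python
from statistics import NormalDist


def percentile_from_z(z: float) -> float:
    """Percent of a normal distribution lying below a given standard score."""
    # NormalDist() with no arguments is the standard normal (mean 0, std dev 1).
    return 100.0 * NormalDist().cdf(z)


print(f"z = 1.0 is at about the {percentile_from_z(1.0):.0f}th percentile")
```

The function applies to any standard score, so it can stand in for a lookup in a standard score table when one is not at hand.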


Figure 5.21 Normal distribution curve showing 84% of scores less than 1 standard deviation above the mean.


Standard Scores

Computing Standard Scores

The number of standard deviations a data value lies above or below the mean is called its standard score (or z-score), defined by

$$z = \text{standard score} = \frac{\text{data value} - \text{mean}}{\text{standard deviation}}$$

The standard score is positive for data values above the mean and negative for data values below the mean.


EXAMPLE 6 Finding Standard Scores

The Stanford-Binet IQ test is scaled so that scores have a mean of 100 and a standard deviation of 16. Find the standard scores for IQs of 85, 100, and 125.

Solution: We calculate the standard scores for these IQs by using the standard score formula with a mean of 100 and standard deviation of 16.

$$\text{standard score for 85: } z = \frac{85 - 100}{16} \approx -0.94$$

$$\text{standard score for 100: } z = \frac{100 - 100}{16} = 0.00$$


EXAMPLE 6 Finding Standard Scores

Solution: (cont.)

$$\text{standard score for 125: } z = \frac{125 - 100}{16} \approx 1.56$$

We can interpret these standard scores as follows: 85 is 0.94 standard deviation below the mean, 100 is equal to the mean, and 125 is 1.56 standard deviations above the mean.
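The standard score calculation in Example 6 translates directly into a one-line function. The sketch below uses a hypothetical `standard_score` helper and the mean and standard deviation given for the Stanford-Binet test.

```python
def standard_score(value: float, mean: float, std_dev: float) -> float:
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / std_dev


IQ_MEAN = 100
IQ_STD_DEV = 16

for iq in (85, 100, 125):
    z = standard_score(iq, IQ_MEAN, IQ_STD_DEV)
    print(f"IQ {iq}: z = {z:.2f}")
```

The sign of the result carries the direction: negative for scores below the mean, zero at the mean, positive above it.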


Figure 5.22 shows the values on the distribution of IQ scores from Example 6.

Figure 5.22 Standard scores for IQ scores of 85, 100, and 125.


Standard Scores and Percentiles

Once we know the standard score of a data value, the properties of the normal distribution allow us to find its percentile in the distribution. This is usually done with a standard score table, such as Table 5.1 (next slide).

(Appendix A has a more detailed standard score table.)


EXAMPLE 7 Cholesterol Levels

Cholesterol levels in men 18 to 24 years of age are normally distributed with a mean of 178 and a standard deviation of 41.

a. What is the percentile for a 20-year-old man with a cholesterol level of 190?

Solution: a. The standard score for a cholesterol level of 190 is

$$z = \text{standard score} = \frac{\text{data value} - \text{mean}}{\text{standard deviation}} = \frac{190 - 178}{41} \approx 0.29$$

Table 5.1 shows that a standard score of 0.29 corresponds to about the 61st percentile.


EXAMPLE 7 Cholesterol Levels

Cholesterol levels in men 18 to 24 years of age are normally distributed with a mean of 178 and a standard deviation of 41.

b. What cholesterol level corresponds to the 90th percentile, the level at which treatment may be necessary?

Solution: b. Table 5.1 shows that 90.32% of all data values have a standard score less than 1.3. Thus, the 90th percentile is about 1.3 standard deviations above the mean. Given the mean cholesterol level of 178 and the standard deviation of 41, a cholesterol level 1.3 standard deviations above the mean is

$$178 + (1.3 \times 41) = 231.3$$

A cholesterol level of about 231 corresponds to the 90th percentile.
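Both directions of Example 7 (value to percentile, and percentile to value) can be sketched with the standard library. The code assumes the stated mean and standard deviation; it computes from the curve directly, so its results differ slightly from answers read off a table with rounded standard scores.

```python
from statistics import NormalDist

# Cholesterol levels for men aged 18 to 24, as given in the example.
cholesterol = NormalDist(mu=178, sigma=41)

# Part (a): percentile of a cholesterol level of 190.
percentile_of_190 = 100.0 * cholesterol.cdf(190)

# Part (b): cholesterol level at the 90th percentile (inverse lookup).
level_at_90th = cholesterol.inv_cdf(0.90)

print(f"a level of 190 is at about the {percentile_of_190:.0f}th percentile")
print(f"the 90th percentile is a level of about {level_at_90th:.0f}")
```

The 90th-percentile level comes out just under the 231 found in the solution, because the table value z = 1.3 corresponds to the 90.32nd percentile rather than exactly the 90th.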


Toward Probability

Suppose you pick a baby at random and ask whether the baby was born more than 15 days prior to his or her due date. Because births are normally distributed around the due date with a standard deviation of 15 days, we know that 16% of all births occur more than 15 days prior to the due date (see Example 3).

For an individual baby chosen at random, we can therefore say that there’s a 0.16 chance (about 1 in 6) that the baby was born more than 15 days early.

In other words, the properties of the normal distribution allow us to make a probability statement about an individual. In this case, our statement is that the probability of a birth occurring more than 15 days early is 0.16.

This example shows that the properties of the normal distribution can be restated in terms of ideas of probability.


THE CENTRAL LIMIT THEOREM

Section 5.3


Suppose we roll one die 1,000 times and record the outcome of each roll, which can be the number 1, 2, 3, 4, 5, or 6.

Figure 5.23 shows a histogram of outcomes. All six outcomes have roughly the same relative frequency, because the die is equally likely to land in each of the six possible ways. That is, the histogram shows a (nearly) uniform distribution (see Section 4.2).

It turns out that the distribution in Figure 5.23 has a mean of 3.41 and a standard deviation of 1.73.

Figure 5.23 Frequency and relative frequency distribution of outcomes from rolling one die 1,000 times.


Now suppose we roll two dice 1,000 times and record the mean of the two numbers that appear on each roll. To find the mean for a single roll, we add the two numbers and divide by 2.

Figure 5.25a shows a typical result. The most common values in this distribution are the central values 3.0, 3.5, and 4.0. These values are common because they can occur in several ways.

The mean and standard deviation for this distribution are 3.43 and 1.21, respectively.

Figure 5.25a Frequency and relative frequency distribution of sample means from rolling two dice 1,000 times.


Suppose we roll five dice 1,000 times and record the mean of the five numbers on each roll. A histogram for this experiment is shown in Figure 5.25b.

Once again we see that the central values around 3.5 occur most frequently, but the spread of the distribution is narrower than in the two previous cases.

The mean and standard deviation are 3.46 and 0.74, respectively.

Figure 5.25b Frequency and relative frequency distribution of sample means from rolling five dice 1,000 times.


If we further increase the number of dice to ten on each of 1,000 rolls, we find the histogram in Figure 5.25c, which is even narrower.

In this case, the mean is 3.49 and standard deviation is 0.56.

Figure 5.25c Frequency and relative frequency distribution of sample means from rolling ten dice 1,000 times.


Table 5.2 shows that as the sample size increases, the mean of the distribution of means approaches the value 3.5 and the standard deviation becomes smaller (making the distribution narrower).

More important, the distribution looks more and more like a normal distribution as the sample size increases.
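The dice experiment is easy to rerun as a simulation. The sketch below is not the data behind Figures 5.23 to 5.25 or Table 5.2; it generates fresh random rolls, so its numbers will differ slightly from those quoted. The function name `simulate_sample_means` and the seed are arbitrary choices, and the comparison column uses the theoretical standard deviation of one fair die, √(35/12) ≈ 1.71, divided by √n.

```python
import random
from math import sqrt
from statistics import mean, pstdev

# Theoretical standard deviation of a single fair six-sided die.
SIGMA_ONE_DIE = sqrt(35 / 12)


def simulate_sample_means(num_dice, num_rolls=1000, rng=None):
    """Roll num_dice dice num_rolls times; return the mean of the dice for each roll."""
    rng = rng or random.Random()
    return [
        sum(rng.randint(1, 6) for _ in range(num_dice)) / num_dice
        for _ in range(num_rolls)
    ]


rng = random.Random(2009)  # fixed seed so the run is repeatable
for n in (1, 2, 5, 10):
    means = simulate_sample_means(n, 1000, rng)
    print(
        f"n = {n:2d}: mean of means = {mean(means):.2f}, "
        f"std dev = {pstdev(means):.2f}, "
        f"sigma/sqrt(n) = {SIGMA_ONE_DIE / sqrt(n):.2f}"
    )
```

As the number of dice grows, the mean of the sample means should stay near 3.5 while the spread shrinks in step with the last column.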


The Central Limit Theorem

Suppose we take many random samples of size n for a variable with any distribution (not necessarily a normal distribution) and record the distribution of the means of each sample. Then,

1. The distribution of means will be approximately a normal distribution for large sample sizes.

2. The mean of the distribution of means approaches the population mean, μ, for large sample sizes.

3. The standard deviation of the distribution of means approaches σ/√n for large sample sizes, where σ is the standard deviation of the population.


Be sure to note the very important adjustment, described by item 3 of the Central Limit Theorem, that must be made when working with samples or groups instead of individuals:

The standard deviation of the distribution of sample means is not the standard deviation of the population, σ, but rather σ/√n, where n is the size of the samples.


TECHNICAL NOTE

(1) For practical purposes, the distribution of means will be nearly normal if the sample size is larger than 30.

(2) If the original population is normally distributed, then the sample means will be normally distributed for any sample size n.

(3) In the ideal case, where the distribution of means is formed from all possible samples, the mean of the distribution of means equals μ and the standard deviation of the distribution of means equals σ/√n.


Figure 5.26 As the sample size increases (n = 5, 10, 30), the distribution of sample means approaches a normal distribution, regardless of the shape of the original distribution. The larger the sample size, the smaller is the standard deviation of the distribution of sample means.


EXAMPLE 1 Predicting Test Scores

You are a middle school principal and your 100 eighth-graders are about to take a national standardized test. The test is designed so that the mean score is μ = 400 with a standard deviation of σ = 70. Assume the scores are normally distributed.

a. What is the likelihood that one of your eighth-graders, selected at random, will score below 375 on the exam?

Solution: a. In dealing with an individual score, we use the method of standard scores discussed in Section 5.2. Given the mean of 400 and standard deviation of 70, a score of 375 has a standard score of

$$z = \frac{\text{data value} - \text{mean}}{\text{standard deviation}} = \frac{375 - 400}{70} \approx -0.36$$


EXAMPLE 1 Predicting Test Scores

Solution: (cont.)

According to Table 5.1, a standard score of -0.36 corresponds to about the 36th percentile; that is, 36% of all students can be expected to score below 375. Thus, there is about a 0.36 chance that a randomly selected student will score below 375.

Notice that we need to know that the scores have a normal distribution in order to make this calculation, because the table of standard scores applies only to normal distributions.


EXAMPLE 1 Predicting Test Scores

You are a middle school principal and your 100 eighth-graders are about to take a national standardized test. The test is designed so that the mean score is μ = 400 with a standard deviation of σ = 70. Assume the scores are normally distributed.

b. Your performance as a principal depends on how well your entire group of eighth-graders scores on the exam. What is the likelihood that your group of 100 eighth-graders will have a mean score below 375?

Solution: b. The question about the mean of a group of students must be handled with the Central Limit Theorem. According to this theorem, if we take random samples of size n = 100 students and compute the mean test score of each group, the distribution of means is approximately normal.


EXAMPLE 1 Predicting Test Scores

Solution: (cont.)

Moreover, the mean of this distribution is μ = 400 and its standard deviation is σ/√n = 70/√100 = 7. With these values for the mean and standard deviation, the standard score for a mean test score of 375 is

$$z = \frac{\text{data value} - \text{mean}}{\text{standard deviation}} = \frac{375 - 400}{7} \approx -3.57$$

Table 5.1 shows that a standard score of -3.5 corresponds to the 0.02th percentile, and the standard score in this case is even lower.

In other words, fewer than 0.02% of all random samples of 100 students will have a mean score of less than 375.


EXAMPLE 1 Predicting Test Scores

Solution: (cont.)

Therefore, the chance that a randomly selected group of 100 students will have a mean score below 375 is less than 0.0002, or about 1 in 5,000.

Notice that this calculation regarding the group mean did not depend on the individual scores’ having a normal distribution.

This example has an important lesson. The likelihood of an individual scoring below 375 is more than 1 in 3 (36%), but the likelihood of a group of 100 students having a mean score below 375 is less than 1 in 5,000 (0.02%).

In other words, there is much more variation in the scores of individuals than in the means of groups of individuals.
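The contrast between the individual and the group can be reproduced in a few lines. This sketch assumes the population values from the example (mean 400, standard deviation 70, group size 100) and treats the distribution of group means as normal with standard deviation σ/√n, as the Central Limit Theorem suggests; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

population_mean = 400
population_std_dev = 70
group_size = 100
cutoff = 375

# One student: use the population standard deviation.
individual_scores = NormalDist(population_mean, population_std_dev)
p_individual = individual_scores.cdf(cutoff)

# Mean of 100 students: the standard deviation shrinks to sigma / sqrt(n).
group_means = NormalDist(population_mean, population_std_dev / sqrt(group_size))
p_group = group_means.cdf(cutoff)

print(f"chance one student scores below {cutoff}:       {p_individual:.2f}")
print(f"chance the group mean falls below {cutoff}:     {p_group:.5f}")
```

The only change between the two calculations is the standard deviation, 70 versus 7, yet the probability drops from roughly one in three to less than one in 5,000.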


The Value of the Central Limit Theorem

The Central Limit Theorem allows us to say something about the mean of a group if we know the mean, μ, and the standard deviation, σ, of the entire population. This can be useful, but it turns out that the opposite application is far more important.

Two major activities of statistics are making estimates of population means and testing claims about population means. Is it possible to make a good estimate of the population mean knowing only the mean of a much smaller sample?

As you can probably guess, being able to answer this type of question lies at the heart of statistical sampling, especially in polls and surveys. The Central Limit Theorem provides the key to answering such questions.


Q & A???
