Chapter 3 Numerically Summarizing Data 3.3 Measures of Central Tendency and Dispersion from Grouped Data


Chapter 3 Numerically Summarizing Data

3.3 Measures of Central Tendency and Dispersion from Grouped Data


EXAMPLE Approximating the Mean from a Frequency Distribution

The following frequency distribution represents the time between eruptions (in seconds) for a random sample of 45 eruptions at the Old Faithful Geyser in California. Approximate the mean time between eruptions.


EXAMPLE Computing a Weighted Mean

Bob goes to the “Buy the Weigh” Nut store and creates his own bridge mix. He combines 1 pound of raisins, 2 pounds of chocolate covered peanuts, and 1.5 pounds of cashews. The raisins cost $1.25 per pound, the chocolate covered peanuts cost $3.25 per pound, and the cashews cost $5.40 per pound. What is the cost per pound of this mix?
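One way to check the arithmetic is a short Python sketch. It treats the price per pound as the value and the number of pounds as the weight; the function name `weighted_mean` is an illustrative choice, not something from the slides.

```python
def weighted_mean(values, weights):
    """Return sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Prices per pound (the values) and pounds purchased (the weights), from the example.
prices = [1.25, 3.25, 5.40]
pounds = [1, 2, 1.5]

cost_per_pound = weighted_mean(prices, pounds)
print(f"${cost_per_pound:.2f} per pound")  # 15.85 / 4.5, about $3.52
```

The mix costs $15.85 in total for 4.5 pounds, so the weighted mean is about $3.52 per pound.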


EXAMPLE Approximating the Standard Deviation from a Frequency Distribution

The following frequency distribution represents the time between eruptions (in seconds) for a random sample of 45 eruptions at the Old Faithful Geyser in California. Approximate the standard deviation of the time between eruptions.
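The frequency table itself did not survive in this transcript, so the sketch below uses made-up class midpoints and frequencies purely for illustration; they are not the Old Faithful data. It assumes the usual grouped-data approach, in which every observation in a class is treated as if it fell at the class midpoint, and it uses the sample (n - 1) form of the standard deviation. The function names are hypothetical.

```python
import math


def grouped_mean(midpoints, frequencies):
    """Approximate the mean as sum(x_i * f_i) / sum(f_i)."""
    n = sum(frequencies)
    return sum(x * f for x, f in zip(midpoints, frequencies)) / n


def grouped_sample_std(midpoints, frequencies):
    """Approximate the sample standard deviation from grouped data."""
    n = sum(frequencies)
    mean = grouped_mean(midpoints, frequencies)
    sum_sq = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, frequencies))
    return math.sqrt(sum_sq / (n - 1))


# Hypothetical class midpoints and frequencies (NOT the data from the slides).
midpoints = [10, 20, 30]
frequencies = [2, 5, 3]

approx_mean = grouped_mean(midpoints, frequencies)
approx_std = grouped_sample_std(midpoints, frequencies)
print(approx_mean, round(approx_std, 2))
```

Because the raw observations are replaced by midpoints, both results are approximations of the values that would be computed from the ungrouped data.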


Chapter 3 Numerically Summarizing Data

3.4 Measures of Location


The z-score represents the number of standard deviations that a data value is from the mean.

It is obtained by subtracting the mean from the data value and dividing this result by the standard deviation.

The z-score is unitless with a mean of 0 and a standard deviation of 1.


Population z-score: $z = \dfrac{x - \mu}{\sigma}$

Sample z-score: $z = \dfrac{x - \bar{x}}{s}$


EXAMPLE Using Z-Scores

The mean height of males 20 years or older is 69.1 inches with a standard deviation of 2.8 inches. The mean height of females 20 years or older is 63.7 inches with a standard deviation of 2.7 inches. Data based on information obtained from the National Health and Examination Survey. Who is relatively taller: Shaquille O’Neal, whose height is 85 inches, or Lisa Leslie, whose height is 77 inches?


Answer:

Shaquille O’Neal: z = (85 - 69.1)/2.8 ≈ 5.68

Lisa Leslie: z = (77 - 63.7)/2.7 ≈ 4.93

Because O’Neal’s z-score is greater than Leslie’s z-score, O’Neal is relatively taller: his height is more standard deviations above the mean of his group than Leslie’s height is above the mean of hers.
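The comparison can be reproduced with a few lines of Python. This is a minimal sketch using the means and standard deviations stated in the example; the function name `z_score` is an illustrative choice.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / std_dev


# Heights, means, and standard deviations (in inches) from the example.
z_oneal = z_score(85, mean=69.1, std_dev=2.8)
z_leslie = z_score(77, mean=63.7, std_dev=2.7)

print(f"O'Neal: {z_oneal:.2f}, Leslie: {z_leslie:.2f}")
print("Relatively taller:", "O'Neal" if z_oneal > z_leslie else "Leslie")
```

Because z-scores are unitless, the same function can compare values measured on different scales, as long as each value is standardized with the mean and standard deviation of its own group.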


The median divides the lower 50% of a set of data from the upper 50% of a set of data. In general, the kth percentile, denoted Pk, of a set of data divides the lower k% of a data set from the upper (100 – k)% of a data set.


Computing the kth Percentile, Pk

Step 1: Arrange the data in ascending order.

Step 2: Compute an index i using the following formula:

$$i = \left(\frac{k}{100}\right) n$$

where k is the percentile of the data value and n is the number of individuals in the data set.

Step 3: (a) If i is not an integer, round up to the next highest integer. Locate the ith value of the data set written in ascending order. This number represents the kth percentile.

(b) If i is an integer, the kth percentile is the arithmetic mean of the ith and (i + 1)st data values.
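The three steps can be sketched in Python as follows. The sketch assumes the index is i = (k/100)·n, which matches the worked answers in these slides, and it restricts k to whole numbers from 1 to 99 so that the (i + 1)st value always exists. The function name and the sample data are hypothetical. Note that statistical libraries often use other interpolation rules, so their results may differ from this one.

```python
def kth_percentile(data, k):
    """Return the kth percentile using the round-up / average rule."""
    if not isinstance(k, int) or not 1 <= k <= 99:
        raise ValueError("k must be an integer from 1 to 99")
    ordered = sorted(data)               # Step 1: ascending order
    n = len(ordered)
    whole, remainder = divmod(k * n, 100)  # Step 2: i = (k/100) * n, kept exact
    if remainder != 0:
        i = whole + 1                    # Step 3(a): round up, take the ith value
        return ordered[i - 1]            # positions are 1-based, lists are 0-based
    # Step 3(b): i is an integer, so average the ith and (i + 1)st values
    return (ordered[whole - 1] + ordered[whole]) / 2


# Hypothetical data set with n = 10 (not the employment ratio data).
sample = [21, 3, 14, 8, 30, 7, 25, 12, 18, 13]

print(kth_percentile(sample, 25))  # i = 2.5, round up to 3: third value
print(kth_percentile(sample, 50))  # i = 5: mean of the 5th and 6th values
```

Working with `divmod(k * n, 100)` instead of floating-point division avoids misclassifying an index such as 30.0 as a non-integer because of rounding error.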


EXAMPLE Finding a Percentile

For the employment ratio data on the next slide, find the

(a) 60th percentile

(b) 33rd percentile


Answer:

(a) 60th percentile: the index is i = (60/100)(51) = 30.6. Because 30.6 is not an integer, round up to 31. The 31st data value is 66.1, so the 60th percentile is 66.1.

(b) 33rd percentile: the index is i = (33/100)(51) = 16.83. Because 16.83 is not an integer, round up to 17. The 17th data value is 63.6, so the 33rd percentile is 63.6.


Finding the Percentile that Corresponds to a Data Value

Step 1: Arrange the data in ascending order.


Step 2: Use the following formula to determine the percentile of the score, x:

$$\text{Percentile of } x = \frac{\text{number of data values less than } x}{n} \cdot 100$$

Round this number to the nearest integer.
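A minimal Python sketch of this procedure follows. It assumes the percentile of x is 100 times the fraction of data values strictly less than x, and it rounds halves upward, which is one reasonable reading of "nearest integer". The function name and the sample data are hypothetical.

```python
import math


def percentile_of(data, x):
    """Return the percentile rank of x, rounded to the nearest integer."""
    if not data:
        raise ValueError("data must not be empty")
    n = len(data)
    below = sum(1 for value in data if value < x)  # values strictly less than x
    # floor(p + 0.5) rounds halves up (assumption; Python's round() rounds
    # halves to the nearest even integer instead).
    return math.floor(100 * below / n + 0.5)


# Hypothetical data set with n = 10 (not the employment ratio data).
sample = [3, 7, 8, 12, 13, 14, 18, 21, 25, 30]

print(percentile_of(sample, 14))  # 5 of the 10 values are below 14
print(percentile_of(sample, 30))  # 9 of the 10 values are below 30
```

Counting the values below x does not depend on their order, so this sketch skips the explicit sort in Step 1.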


EXAMPLE Finding the Percentile Rank of a Data Value

Find the percentile rank of the employment ratio of Michigan.


The most common percentiles are quartiles. Quartiles divide data sets into fourths or four equal parts.

• The 1st quartile, denoted Q1, divides the bottom 25% of the data from the top 75%. Therefore, the 1st quartile is equivalent to the 25th percentile.


• The 2nd quartile, denoted Q2, divides the bottom 50% of the data from the top 50% of the data, so that the 2nd quartile is equivalent to the 50th percentile, which is equivalent to the median.


• The 3rd quartile, denoted Q3, divides the bottom 75% of the data from the top 25% of the data, so that the 3rd quartile is equivalent to the 75th percentile.


EXAMPLE Finding the Quartiles

Find the quartiles corresponding to the employment ratio data.


Checking for Outliers Using Quartiles

Step 1: Determine the first and third quartiles of the data.


Step 2: Compute the interquartile range. The interquartile range, or IQR, is the difference between the third and first quartiles. That is, IQR = Q3 - Q1.


Step 3: Compute the fences that serve as cut-off points for outliers.

Lower Fence = Q1 - 1.5(IQR)

Upper Fence = Q3 + 1.5(IQR)


Step 4: If a data value is less than the lower fence or greater than the upper fence, then it is considered an outlier.
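Steps 2 through 4 can be sketched in Python as shown here. The quartiles are passed in as inputs because Step 1 is assumed to have been carried out already; the function names, the quartile values, and the sample data are all hypothetical.

```python
def fences(q1, q3):
    """Return (lower fence, upper fence) from the first and third quartiles."""
    iqr = q3 - q1                             # Step 2: IQR = Q3 - Q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr     # Step 3: cut-off points


def find_outliers(data, q1, q3):
    """Return the data values that fall outside the fences."""
    lower, upper = fences(q1, q3)
    # Step 4: strictly below the lower fence or strictly above the upper fence
    return [x for x in data if x < lower or x > upper]


# Hypothetical quartiles and data (not the employment ratio data).
q1, q3 = 10, 20
sample = [-7, 3, 12, 18, 35, 40]

print(fences(q1, q3))                 # IQR = 10, so the fences are -5 and 35
print(find_outliers(sample, q1, q3))  # 35 sits on the fence, so it is not flagged
```

A value exactly equal to a fence is not counted as an outlier here, following the wording "less than the lower fence or greater than the upper fence".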


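EXAMPLE Checking for Outliers

Check the employment ratio data for outliers.

Q1 is the 13th data value, 62.9. Q3 is the 38th data value, 67.2.

IQR = Q3 - Q1 = 67.2 - 62.9 = 4.3

Lower Fence = 62.9 - 1.5(4.3) = 56.45

Upper Fence = 67.2 + 1.5(4.3) = 73.65

The data value 52.7 is less than the lower fence, so it is an outlier.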


West Virginia