McGraw-Hill/Irwin Copyright © 2009 by The McGraw-Hill Companies, Inc. All Rights Reserved. Chapter 2 Descriptive Statistics: Tabular and Graphical Methods

Chapter 2


Page 1: Chapter 2

McGraw-Hill/Irwin

Copyright © 2009 by The McGraw-Hill Companies, Inc. All Rights Reserved.

Chapter 2

Descriptive Statistics: Tabular and Graphical Methods

Page 2: Chapter 2

2-2

Descriptive Statistics

2.1 Graphically Summarizing Qualitative Data

2.2 Graphically Summarizing Quantitative Data

2.3 Dot Plots

2.4 Stem-and-Leaf Displays

2.5 Crosstabulation Tables (Optional)

Page 3: Chapter 2

2-3

Descriptive Statistics Continued

2.6 Scatter Plots (Optional)

2.7 Misleading Graphs and Charts (Optional)

Page 4: Chapter 2

2-4

Graphically Summarizing Qualitative Data

• Frequency distribution: A table that summarizes the number of items in each of several non-overlapping classes.

• The purpose is to make the data easier to understand.

• With qualitative data, the possible values naturally identify the different categories. For example, in the dealership data, the categories are the vehicle models sold.

Page 5: Chapter 2

2-5

Example 2.1: Describing 2006 Jeep Purchasing Patterns

• Table 2.1 lists all 251 vehicles sold in 2006 by the greater Cincinnati Jeep dealers

• Table 2.1 does not reveal much useful information

• A frequency distribution is a useful summary

– Simply count the number of times each model appears in Table 2.1

Page 6: Chapter 2

2-6

The Resulting Frequency Distribution

Jeep Model        Frequency
Commander         71
Grand Cherokee    70
Liberty           80
Wrangler          30
Total             251

Page 7: Chapter 2

2-7

Relative Frequency and Percent Frequency

• Relative frequency summarizes the proportion of items in each class

• For each class, divide the frequency of the class by the total number of observations

• Multiply the relative frequency by 100 to obtain the percent frequency
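The two calculations above can be sketched in a few lines of Python. This is only an illustration: the counts are the Jeep frequencies from Example 2.1, and the variable names (`frequencies`, `relative`, `percent`) are chosen here rather than taken from the text.

```python
# Frequencies from the Jeep frequency distribution (Example 2.1).
frequencies = {
    "Commander": 71,
    "Grand Cherokee": 70,
    "Liberty": 80,
    "Wrangler": 30,
}

total = sum(frequencies.values())  # total number of observations

# Relative frequency: class frequency divided by the total.
relative = {model: count / total for model, count in frequencies.items()}

# Percent frequency: relative frequency multiplied by 100.
percent = {model: 100 * rel for model, rel in relative.items()}

for model in frequencies:
    print(f"{model:<15} {relative[model]:.4f} {percent[model]:6.2f}%")
```

The relative frequencies should sum to 1 and the percent frequencies to 100, which is a quick check on the arithmetic.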

Page 8: Chapter 2

2-8

The Resulting Relative Frequency and Percent Frequency Distribution

Jeep Model        Relative Frequency    Percent Frequency
Commander         71/251 = 0.2829       28.29%
Grand Cherokee    70/251 = 0.2789       27.89%
Liberty           80/251 = 0.3187       31.87%
Wrangler          30/251 = 0.1195       11.95%
Total             1.0000                100.00%

Page 9: Chapter 2

2-9

Bar Charts and Pie Charts

• Bar chart: A vertical or horizontal rectangle represents the frequency for each category

– Height can be frequency, relative frequency, or percent frequency

• Pie chart: A circle divided into slices where the size of each slice represents its relative frequency or percent frequency

Page 10: Chapter 2

2-10

Excel Bar Chart of the Jeep Sales Data

Page 11: Chapter 2

2-11

Excel Pie Chart of the Jeep Sales Data

Page 12: Chapter 2

2-12

Pareto Chart

• Pareto chart: A bar chart motivated by the Pareto principle (the 80-20 law), for example “In many economies, most of the wealth is held by a small minority of the population” or “Most of the defects are caused by a small portion of reasons”

– Bar height represents the frequency of occurrence

– Bars are arranged in decreasing height from left to right
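A minimal sketch of the ordering step behind a Pareto chart follows. The labeling-defect counts are not reproduced in this text, so the Jeep frequencies from Example 2.1 are reused purely as stand-in data. The running cumulative percentage is an extra that many Pareto charts also plot; it is an addition here, not something stated on the slide.

```python
from itertools import accumulate

# Stand-in category counts (Jeep frequencies reused as illustrative data).
counts = {"Commander": 71, "Grand Cherokee": 70, "Liberty": 80, "Wrangler": 30}

# Arrange categories in decreasing frequency: the left-to-right bar order.
ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)

# Running totals, expressed below as percentages of the grand total.
running = list(accumulate(count for _, count in ordered))
total = running[-1]

for (category, count), cum in zip(ordered, running):
    print(f"{category:<15} {count:>4} {100 * cum / total:6.2f}%")
```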

Page 13: Chapter 2

2-13

• Pareto charts are typically used to prioritize competing or conflicting "problems," so that resources are allocated to the most significant areas.

• In general, though, they can be used to determine which of several classifications have the most "count" or cost associated with them. Examples include the number of people using each ATM versus each indoor teller location, or the profit generated by each of twenty product lines. The important limitation is that the data must be expressed as either counts or costs. The data cannot be in terms that cannot be added, such as percent yields or error rates.

Page 14: Chapter 2

2-14

Excel Frequency Table and Pareto Chart of Labeling Defects

Page 15: Chapter 2

2-15

Graphically Summarizing Quantitative Data

• We often need to summarize and describe the shape of the distribution

• One way is to group the measurements into the classes of a frequency distribution and then display the data in the form of a histogram

Page 16: Chapter 2

2-16

Frequency Distribution

• A frequency distribution is a list of data classes (non-overlapping intervals) together with the count of values that belong to each class. The frequency distribution is organized as a table.

– “tally and count”

• Show the frequency distribution in a histogram

– The histogram is a picture of the frequency distribution, a special bar chart for quantitative data, with no gaps between the bars.

Page 17: Chapter 2

2-17

Constructing a Frequency Distribution

Steps in making a frequency distribution:

1. Find the number of classes

2. Find the class length

3. Form non-overlapping classes of equal width

4. Tally and count

5. Graph the histogram

Given the non-overlapping classes, you should know how to tally and count and make the histogram.

Page 18: Chapter 2

2-18

Example 2.2 The Payment Time Case: A Sample of Payment Times

22 29 16 15 18 17 12 13 17 16 15

19 17 10 21 15 14 17 18 12 20 14

16 15 16 20 22 14 25 19 23 15 19

18 23 22 16 16 19 13 18 24 24 26

13 18 17 15 24 15 17 14 18 17 21

16 21 25 19 20 27 16 17 16 21

Table 2.4

Page 19: Chapter 2

2-19

Make a histogram

• Find the number of classes

• Find the class length

• Construct non-overlapping classes of equal length

• Tally and count the number of entries in each class

• Graph the histogram.

Page 20: Chapter 2

2-20

Number of Classes

• Group all of the n data into K number of classes

• K is the smallest whole number for which 2^K > n

• In Example 2.2, n = 65

– For K = 6, 2^6 = 64 < n

– For K = 7, 2^7 = 128 > n

– So use K = 7 classes
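The rule above can be sketched as a short loop. The function name `number_of_classes` is chosen here for illustration only.

```python
def number_of_classes(n):
    """Return the smallest whole number K for which 2**K > n."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k


# Example 2.2 has n = 65 payment times.
print(number_of_classes(65))
```

Because 2^6 = 64 is not greater than 65, the loop moves on to K = 7, matching the slide.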

Page 21: Chapter 2

2-21

Class Length

• Compute the class length as (largest measurement - smallest measurement) / K, where K is the number of classes found earlier

• Always round up to the same level of precision as the data

• For Example 2.2, (29 - 10)/7 = 2.7143

– Because payment times are measured in whole days, round up to three days
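As a rough sketch, the class-length calculation looks like this in Python. It assumes the data are recorded in whole units (as the payment times are), so "rounding up to the precision of the data" becomes a plain ceiling. The function name `class_length` is hypothetical.

```python
import math


def class_length(smallest, largest, k):
    """Class length for whole-unit data: (largest - smallest) / k, rounded up."""
    raw = (largest - smallest) / k
    return math.ceil(raw)


# Example 2.2: smallest = 10 days, largest = 29 days, K = 7 classes.
print((29 - 10) / 7)            # raw length, about 2.7143
print(class_length(10, 29, 7))  # rounded up to whole days
```

Data recorded to one decimal place would need rounding up to the next tenth instead, which this sketch does not handle.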

Page 22: Chapter 2

2-22

Form Non-Overlapping Classes of Equal Width

• The classes start on the smallest value

– This is the lower limit of the first class

• The upper limit of the first class is smallest value + class length

– In the example, the first class starts at 10 days and goes up to (but does not include) 13 days

• The next class starts at this upper limit and goes up by class length

• And so on
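The procedure above can be sketched as follows. Each class is represented as a (lower limit, upper limit) pair, with the upper limit excluded from the class. The function name `class_limits` is chosen here for illustration.

```python
def class_limits(smallest, length, k):
    """Return k (lower, upper) pairs; each class covers lower <= x < upper."""
    return [
        (smallest + i * length, smallest + (i + 1) * length)
        for i in range(k)
    ]


# Payment time example: start at 10 days, class length 3 days, 7 classes.
for lower, upper in class_limits(10, 3, 7):
    print(f"{lower} days and less than {upper} days")
```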

Page 23: Chapter 2

2-23

Seven Non-Overlapping Classes: Payment Time Example

Class 1 10 days and less than 13 days

Class 2 13 days and less than 16 days

Class 3 16 days and less than 19 days

Class 4 19 days and less than 22 days

Class 5 22 days and less than 25 days

Class 6 25 days and less than 28 days

Class 7 28 days and less than 31 days

Page 24: Chapter 2

2-24

Tally and Count the Number of Measurements in Each Class

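Class          Frequency    Relative Frequency
10 ≤ x < 13    3            3/65 = 0.0462
13 ≤ x < 16    14           14/65 = 0.2154
16 ≤ x < 19    23           23/65 = 0.3538
19 ≤ x < 22    12           12/65 = 0.1846
22 ≤ x < 25    8            8/65 = 0.1231
25 ≤ x < 28    4            4/65 = 0.0615
28 ≤ x < 31    1            1/65 = 0.0154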
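The tally-and-count step can be sketched with the 65 payment times from Table 2.4. Because the classes have equal length, the class index of a value is the integer quotient of (value - lower limit of the first class) by the class length. Variable names are illustrative.

```python
# The 65 payment times (in days) from Table 2.4.
payment_times = [
    22, 29, 16, 15, 18, 17, 12, 13, 17, 16, 15,
    19, 17, 10, 21, 15, 14, 17, 18, 12, 20, 14,
    16, 15, 16, 20, 22, 14, 25, 19, 23, 15, 19,
    18, 23, 22, 16, 16, 19, 13, 18, 24, 24, 26,
    13, 18, 17, 15, 24, 15, 17, 14, 18, 17, 21,
    16, 21, 25, 19, 20, 27, 16, 17, 16, 21,
]

first_lower, length, k = 10, 3, 7  # classes 10 <= x < 13, ..., 28 <= x < 31

# Tally: each value falls in exactly one class.
counts = [0] * k
for x in payment_times:
    counts[(x - first_lower) // length] += 1

n = len(payment_times)
relative = [count / n for count in counts]

for i, (count, rel) in enumerate(zip(counts, relative)):
    lower = first_lower + i * length
    print(f"{lower} <= x < {lower + length}: {count:>2}  {rel:.4f}")
```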

Page 25: Chapter 2

2-25

Histogram

• Rectangles represent the classes

• The base represents the class length and limits

• The height represents

– the frequency in a frequency histogram, or

– the relative frequency in a relative frequency histogram
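As a rough, text-only stand-in for a frequency histogram, the sketch below draws one bar per class, with the bar length equal to the class frequency. The frequencies are those of the payment time example. The bars run horizontally here only because that is easier to print; the function name `text_histogram` and the `#` symbol are arbitrary choices.

```python
def text_histogram(labels, counts, symbol="#"):
    """Return one text bar per class; bar length equals the class frequency."""
    return [
        f"{label:>8} | {symbol * count} ({count})"
        for label, count in zip(labels, counts)
    ]


labels = ["10 < 13", "13 < 16", "16 < 19", "19 < 22",
          "22 < 25", "25 < 28", "28 < 31"]
frequencies = [3, 14, 23, 12, 8, 4, 1]

for line in text_histogram(labels, frequencies):
    print(line)
```

Replacing the frequencies with relative frequencies (scaled to a convenient bar length) would give the shape of the relative frequency histogram, which is the same.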

Page 26: Chapter 2

2-26

Histograms

Frequency Histogram Relative Frequency Histogram

Page 27: Chapter 2

2-27

Some Common Distribution Shapes

• Skewed to the right: The right tail of the histogram is longer than the left tail

• Skewed to the left: The left tail of the histogram is longer than the right tail

• Symmetrical: The right and left tails of the histogram appear to be mirror images of each other

Page 28: Chapter 2

2-28

Page 29: Chapter 2

2-29

Mound-shaped or bell-shaped distribution vs. non-mound-shaped distribution.

Symmetric Distribution

Page 30: Chapter 2

2-30

Cumulative Distributions

• Another way to summarize a distribution is to construct a cumulative distribution

• To do this, use the same number of classes, class lengths, and class boundaries used for the frequency distribution

• Rather than the count for each class, we record the number of measurements that are less than the upper boundary of that class. The cumulative count is the sum of the count of the current class and the counts of all previous classes.

– In other words, a running total

Page 31: Chapter 2

2-31

Frequency, Cumulative Frequency, and Cumulative Relative Frequency Distribution

Class      Frequency    Cumulative Frequency    Cumulative Relative Frequency    Cumulative Percent Frequency
10 < 13    3            3                       3/65 = 0.0462                    4.62%
13 < 16    14           17                      17/65 = 0.2615                   26.15%
16 < 19    23           40                      0.6154                           61.54%
19 < 22    12           52                      0.8000                           80.00%
22 < 25    8            60                      0.9231                           92.31%
25 < 28    4            64                      0.9846                           98.46%
28 < 31    1            65                      1.0000                           100.00%
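The running totals in the table can be reproduced with a short sketch using `itertools.accumulate` from the Python standard library. The class frequencies are those of the payment time example; the variable names are illustrative.

```python
from itertools import accumulate

# Class frequencies for the payment time example.
frequencies = [3, 14, 23, 12, 8, 4, 1]

# Cumulative frequency: a running total of the class frequencies.
cumulative = list(accumulate(frequencies))

n = cumulative[-1]  # the last running total is the sample size
cumulative_relative = [c / n for c in cumulative]
cumulative_percent = [100 * r for r in cumulative_relative]

for c, r, p in zip(cumulative, cumulative_relative, cumulative_percent):
    print(f"{c:>3}  {r:.4f}  {p:6.2f}%")
```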

Page 32: Chapter 2

2-32

Stem-and-Leaf Display

• Purpose is to see the overall pattern of the data, by grouping the data into classes

– the variation from class to class

– the amount of data in each class

– the distribution of the data within each class

• Best for small to moderately sized data distributions

Page 33: Chapter 2

2-33

Car Mileage Example

• Refer to the Car Mileage Case

– Data in Table 2.14; the stem is all digits except the last one, and the leaf is the last digit

• The stem-and-leaf display:

29 8

30 13455677888

31 0012334444455667778899

32 01112334455778

33 03

The stem 29 with leaf 8 represents 29 + 0.8 = 29.8; the stem 33 with leaf 3 represents 33 + 0.3 = 33.3
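A minimal sketch of how such a display can be built is given below. It assumes values recorded to one decimal place, so the leaf is the tenths digit and the stem is everything before it. The sample values are a handful read off the display above, not the full Table 2.14, and the function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Map each stem to its string of leaves (values with one decimal place)."""
    rows = defaultdict(list)
    for value in sorted(values):
        tenths = round(value * 10)       # work in tenths to avoid float noise
        stem, leaf = divmod(tenths, 10)  # leaf is the last digit
        rows[stem].append(str(leaf))
    return {stem: "".join(leaves) for stem, leaves in rows.items()}


# A few mileages read off the display (illustrative subset only).
mileages = [31.0, 29.8, 30.3, 33.3, 30.1, 32.0, 31.0, 33.0]

for stem, leaves in stem_and_leaf(mileages).items():
    print(stem, leaves)
```

This sketch omits stems that have no leaves; a full display would usually still list them so that gaps in the data remain visible.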

Page 34: Chapter 2

2-34

Constructing a Stem-and-Leaf Display

• Can split the stems as needed

• For example you can divide one stem into the lower part, which only contains the leaves of ‘0’ ‘1’ ‘2’ ‘3’ ‘4’, and the upper part, which only contains the leaves of ‘5’ ‘6’ ‘7’ ‘8’ ‘9’.

Page 35: Chapter 2

2-35

Split Stems from Car Mileage Example

• Starred classes (*) extend from 0.0 to 0.4

• Unstarred classes extend from 0.5 to 0.9

29 8

30* 134

30 55677888

31* 00123344444

31 55667778899

32* 011123344

32 55778

33* 03
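Splitting the stems can be sketched by extending the row label: following the convention above, leaves 0 to 4 go on the starred stem and leaves 5 to 9 on the unstarred stem. The sample values are the stem-29 and stem-30 mileages read off the display, and the function name `split_stem_and_leaf` is hypothetical.

```python
def split_stem_and_leaf(values):
    """Stem-and-leaf rows with each stem split into a starred lower half
    (leaves 0-4) and an unstarred upper half (leaves 5-9)."""
    rows = {}
    for value in sorted(values):
        stem, leaf = divmod(round(value * 10), 10)
        label = f"{stem}*" if leaf <= 4 else f"{stem}"
        rows.setdefault(label, []).append(str(leaf))
    return {label: "".join(leaves) for label, leaves in rows.items()}


# Mileages on stems 29 and 30, read off the display above.
mileages = [29.8, 30.1, 30.3, 30.4, 30.5, 30.5, 30.6,
            30.7, 30.7, 30.8, 30.8, 30.8]

for label, leaves in split_stem_and_leaf(mileages).items():
    print(f"{label:<4}{leaves}")
```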

Page 36: Chapter 2

2-36

Comparing Two Distributions

• To compare two distributions, construct a back-to-back stem-and-leaf display (or histogram)

• Uses the same stems for both

• The leaves of one distribution are shown on the left side of the stems and the leaves of the other on the right

Page 37: Chapter 2

2-37

Sample Back-to-Back Stem-and-Leaf Display

Page 38: Chapter 2

2-38

Scatter Plots (Optional)

• Used to study relationships between two variables

• Place one variable on the x-axis

• Place a second variable on the y-axis

• Place a dot at the coordinates of each (x, y) pair

Page 39: Chapter 2

2-39

Types of Relationships

• Linear: A straight line relationship between the two variables

• Positive: When one variable goes up, the other variable goes up

• Negative: When one variable goes up, the other variable goes down

• No Linear Relationship: There is no coordinated linear movement between the two variables

Page 40: Chapter 2

2-40

A Scatter Plot Showing a Positive Linear Relationship

Page 41: Chapter 2

2-41

A Scatter Plot Showing Little or No Linear Relationship

Page 42: Chapter 2

2-42

A Scatter Plot Showing a Negative Linear Relationship