Copyright © 2014 Pearson Education, Inc. All rights reserved
Chapter 2
Picturing Variation with Graphs
Learning Objectives
Understand that a distribution of a sample of data displays a variable’s values and the frequencies (or relative frequencies) of those values.
Know how to make graphs of distributions of numerical and categorical variables and how to interpret the graphs in context.
Be able to compare centers and spreads of distributions of samples informally.
2.1
Visualizing Variation in Numerical Data
Visualizing Statistics
Organize the data using the chart that gives the most effective visual summary.
The distribution of the data describes the values, frequencies (counts), and “shape” of the data.
Is there a data value or data values that are far from the rest of the data?
Is there symmetry?
Is there a most common value or most common range of values?
Dot Plots
A Dot Plot is a chart that contains a dot for each data value.
Benefits: shows the individual data values; makes outliers easy to spot; describes the distribution visually.
Drawbacks: not as common as bar and pie charts; not great for data that has too many individual values.
Dot Plot Example
Clearly shows the outlier just below $300.
The rest of the data is generally uniformly spread out.
Frequency Histograms
A histogram is a type of bar graph.
The horizontal axis is numerical.
The vertical axis represents the frequency of the data.
Groups the data into bins, also called intervals or classes.
Easy to visualize the distribution.
Histogram Example
Different bin widths depict the same data differently.
The smaller width shows more detail.
Too small a width shows too much detail and will not clearly display the main features.
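The effect of bin width can be sketched in a few lines of Python. This is an illustrative sketch rather than anything from the slides: the `histogram_rows` helper and the `sample` values are made up for the example, and it assumes every value is at least `start`.

```python
def histogram_rows(values, start, width):
    """Return (low, high, frequency) for each bin [low, high), in order."""
    counts = {}
    for v in values:
        k = int((v - start) // width)  # index of the bin that holds v
        counts[k] = counts.get(k, 0) + 1
    return [
        (start + k * width, start + (k + 1) * width, counts.get(k, 0))
        for k in range(max(counts) + 1)
    ]

# Hypothetical data, for illustration only.
sample = [12, 15, 21, 22, 22, 27, 31, 34, 35, 48]

for width in (20, 10, 2):
    print(f"bin width {width}")
    for low, high, freq in histogram_rows(sample, start=10, width=width):
        print(f"  [{low:>2}, {high:>2}) {'#' * freq}")
```

With a width of 20 the ten values collapse into two bars, while a width of 2 spreads them over twenty bins, most of them empty, which is the "too much detail" case described above.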
Relative Frequency Histograms
A Relative Frequency Histogram is a histogram where the vertical axis represents the relative frequencies, or percents, rather than the frequencies.
Compute the relative frequency by dividing the frequency by the sample size.
The relative frequency histogram always has the same shape as the frequency histogram. The scale of the vertical axis is just changed.
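As a small sketch of the computation described above, the snippet below divides each frequency by the sample size. The `relative_frequencies` helper and the bin counts are hypothetical and only serve the illustration.

```python
def relative_frequencies(frequencies):
    """Divide each frequency by the sample size (the sum of all frequencies)."""
    n = sum(frequencies)
    return [f / n for f in frequencies]

# Hypothetical bin frequencies for a sample of size 10.
freqs = [2, 4, 3, 1]
rel = relative_frequencies(freqs)

for f, r in zip(freqs, rel):
    print(f"frequency {f}  relative frequency {r:.0%}")
```

Every bar is scaled by the same factor 1/n, which is why the shape of the histogram does not change.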
Relative Frequency Example
Clearly shows that half of all women score on average between 0.7 and 0.8 goals per game.
Shows there are a small number of exceptional players.
Women’s Soccer Players, NCAA Division III 2009
Frequency vs. Relative Frequency Histograms
Use a frequency histogram when you want to emphasize how many are in each range.
Use a relative frequency histogram when you want to emphasize what proportion or percent of the total each range contains.
Stem and Leaf Plots
The Leaf is the last digit.
The Stem contains all digits before the last digit.
Shows individual data values.
Same as a histogram, but with a bin width that is a power of 10.
Example: The five 0’s show that there were five classes with 40 students.
[Figure: stem-and-leaf plot of Class Size]
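A stem-and-leaf plot can be built by splitting each value into its last digit and everything before it. The sketch below assumes non-negative whole numbers; the `stem_and_leaf` helper and the class sizes are invented for illustration (the original figure's data is not available), chosen so that five classes have 40 students.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (all digits but the last) to its sorted leaves (last digits)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        stems[stem].append(leaf)
    # Include empty stems so that gaps in the data stay visible.
    return {s: stems.get(s, []) for s in range(min(stems), max(stems) + 1)}

# Hypothetical class sizes, for illustration only.
class_sizes = [25, 28, 31, 33, 36, 40, 40, 40, 40, 40, 42, 47, 51]

plot = stem_and_leaf(class_sizes)
for stem, leaves in plot.items():
    print(f"{stem} | {''.join(str(leaf) for leaf in leaves)}")
```

The row for stem 4 begins with five 0 leaves, one for each class of 40 students.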
2.2
Summarizing Important Features of a Numerical Distribution
Three Aspects of a Distribution
Shape: Is there symmetry? How many bumps or modes? Are there other distinguishing features?
Center: What is a typical value?
Spread: Is the data all close together or spread out?
Skewness
A distribution is Skewed Right if most of the data values are small and there is a “tail” of larger values to the right.
A distribution is Skewed Left if most of the data values are large and there is a “tail” of smaller values to the left.
Symmetric Distributions
A distribution is symmetric if the left hand side is roughly the mirror image of the right hand side.
How Many Mounds
A Unimodal distribution has one mound.
A Bimodal distribution has two mounds.
A Multimodal distribution has more than two mounds.
Normal Distributions
A Normal distribution has the following properties: symmetric, unimodal, and mound or bell shaped.
Outliers
An Outlier is a data value that is either much smaller or much larger than the rest of the data.
Some reasons for outliers:
Error in data collection.
No error. For example, the owner’s salary could be an outlier if the rest of the employees are all low-wage workers.
Center
What is a typical value?
The center is not a typical value for bimodal or skewed distributions.
Variability
Variability describes how spread out the data values are.
Summary of Describing a Distribution
What is the shape? Is it Symmetric, Skewed, or Neither? Unimodal, Bimodal, or Multimodal? Normal? Are there outliers?
Where is the center? Is the center a typical value?
Is there low or high variability?
2.3
Visualizing Variation in Categorical Variables
Two Types of Charts
A Bar Chart is like a histogram, but the horizontal axis can represent categorical data. The categories may not have a natural order.
A Pie Chart is a circle cut into slices where the size of each slice is proportional to the frequency of the outcome that it represents.
Example: Categorical Data
The frequency table below shows the ranks of a group of army members who live in the barracks.
Rank        Private   Corporal   Sergeant   Major
Frequency   65        22         49         16
Bar Chart
A graphical summary for categorical data.
Each category is represented by a bar.
The height of each bar is proportional to the frequency for that category.
There can be more than one choice of ordering the categories.
Bar Chart for Army Ranks
Pareto Chart
A Pareto Chart is a bar chart that orders the categories from largest to smallest frequency.
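Applied to the army-rank frequency table from the earlier example, Pareto ordering is just a sort by frequency, largest first. The sketch below draws a rough text bar chart; the `pareto_order` helper and the scale of one `#` per four people are choices made for this illustration.

```python
# Frequencies from the army-rank table earlier in the chapter.
ranks = {"Private": 65, "Corporal": 22, "Sergeant": 49, "Major": 16}

def pareto_order(frequencies):
    """Return (category, frequency) pairs sorted from largest to smallest."""
    return sorted(frequencies.items(), key=lambda item: item[1], reverse=True)

for rank, freq in pareto_order(ranks):
    # One '#' per four people keeps the text bars short.
    print(f"{rank:<9} {'#' * (freq // 4)} {freq}")
```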
Differences Between Bar Charts and Histograms
A histogram displays numerical data. A bar chart can display categorical data.
The bar widths of a histogram are meaningful and must all be the same size. The bar widths for a bar chart are meaningless.
The bars of a histogram must touch each other. For a bar chart, there are gaps between bars.
For a histogram, there is only one choice for the order of the bars (ascending by x), while there are many choices of order for a bar chart.
Pie Charts
Graphical summary for categorical data.
A circle is cut into several slices. The size of each slice is proportional to the frequency of the category that it represents.
Often used to display how much of a share each category has of the whole.
If f is the frequency and n is the sample size, the angle of each slice is
$$\text{Angle} = 360^\circ \times \frac{f}{n}$$
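As a worked example of the slice-angle formula, the sketch below applies it to the army-rank frequencies from the earlier table. The variable names are chosen for this illustration only.

```python
# Frequencies from the army-rank table earlier in the chapter.
ranks = {"Private": 65, "Corporal": 22, "Sergeant": 49, "Major": 16}

n = sum(ranks.values())  # sample size
angles = {rank: 360 * f / n for rank, f in ranks.items()}

for rank, angle in angles.items():
    print(f"{rank:<9} {angle:6.1f} degrees")
```

The sample size here is 152, and the four angles add up to 360 degrees, as they must for a full circle.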
Pie Chart of Army Ranks
2.4
Summarizing Categorical Distributions
Description of Numerical Distributions vs. Categorical Distributions
Numerical Distributions: Shape, Center, Spread
Categorical Distributions: Mode, Variability or Diversity
Example of a Bar Chart with Pop as the Mode
Mode
The Mode is the category that occurs with the highest frequency.
The mode is thought of as the typical outcome.
If there is a close tie between two categories for most frequently occurring, the distribution is called bimodal.
If more than two categories have roughly the tallest bars, the distribution is called multimodal.
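The slides do not say how close a "close tie" has to be, so the sketch below makes that an explicit parameter. The `modes` helper, the `tolerance` value of 10%, and the drink counts are all assumptions introduced for illustration; only the army-rank frequencies come from the chapter.

```python
def modes(frequencies, tolerance=0.0):
    """Return the categories whose frequency is within `tolerance` of the largest.

    tolerance=0.0 gives only exact ties for the top frequency. A positive value
    (an assumption of this sketch, not a rule from the slides) also admits
    near-ties.
    """
    top = max(frequencies.values())
    return [c for c, f in frequencies.items() if f >= top * (1 - tolerance)]

# Frequencies from the army-rank table earlier in the chapter.
ranks = {"Private": 65, "Corporal": 22, "Sergeant": 49, "Major": 16}
print(modes(ranks))  # a single mode

# Hypothetical counts with a close tie between two categories.
drinks = {"Pop": 30, "Soda": 29, "Coke": 12}
print(modes(drinks, tolerance=0.10))  # two near-tied categories
```

One category returned suggests a single mode, two suggest a bimodal distribution, and more than two suggest a multimodal one.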
Bimodal Distribution
Multimodal Bar Chart
Variability
If the distribution has a lot of diversity (many observations in many different categories), then variability is high.
If the distribution has only a little diversity (many of the observations fall into the same category), then variability is low.
Caution: Variability is about many different categories, not many frequencies.
High Variability
Low Variability
Side-by-Side Bar Chart
2.5
Interpreting Graphs
Ways to Mislead with Graphs: Don’t Do Any of These!
Have the frequency scale not begin at 0 to create the illusion of greater differences.
Use symbols other than bars that hide or accentuate the real differences.
Use unequal width bars.
Scale Not Starting at 0
The left bar chart misleads by making the differences seem greater than they are.
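The size of this distortion can be put in numbers. The sketch below compares how tall two bars look relative to each other when the axis starts at 0 and when it starts higher; the `apparent_ratio` helper and the frequencies 50 and 55 are hypothetical, since the figure's actual values are not available.

```python
def apparent_ratio(a, b, baseline=0):
    """Ratio of the drawn heights of bars b and a when the axis starts at `baseline`."""
    return (b - baseline) / (a - baseline)

# Hypothetical frequencies, for illustration only.
a, b = 50, 55

print(f"axis from 0:  bar b looks {apparent_ratio(a, b):.1f} times as tall as bar a")
print(f"axis from 45: bar b looks {apparent_ratio(a, b, baseline=45):.1f} times as tall as bar a")
```

A 10% difference in frequency is drawn as a bar twice as tall once the axis starts at 45 instead of 0.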
Scale Unclear
This is misleading because we cannot see the frequencies.
[Figure: Homes Sold by Year, 2005 to 2008]
Scale Unclear
The scale is by area and not by just height.
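A short calculation shows why scaling by area misleads. If a symbol is enlarged in both width and height in proportion to the value it represents, its area grows with the square of the value. The `apparent_area_ratio` helper below is a hypothetical name used only for this sketch.

```python
def apparent_area_ratio(value_ratio):
    """Area ratio when a symbol's width and height are both scaled by value_ratio."""
    return value_ratio ** 2

for ratio in (1.5, 2, 3):
    print(f"value grows {ratio}x -> symbol area grows {apparent_area_ratio(ratio)}x")
```

Under this assumption, a year with twice as many homes sold would be drawn with a symbol covering four times the area.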
[Figure: Homes Sold by Year, 2005 to 2008]
Other Creative Charting Techniques
The Internet and computers allow for additional effects. Examples: Analysis of State of the Union Speeches; World Population Changes.
Chapter 2
Case Study
Class Sizes: Private vs. Public
Using raw data is ineffective for this comparison
Private Colleges’ Student-to-Teacher Ratio
Typical ratio between 10 and 11.
Skewed right.
Outlier at a student-to-teacher ratio of 54.
Large variation.
Public Colleges’ Student-to-Teacher Ratio
Typical ratio between 16 and 20.
Generally symmetric.
Outlier at a fractional student-to-teacher ratio.
Less variation.
Comparing the Histograms
It is much easier to describe the data when they are displayed using histograms compared to just the raw data table.
Chapter 2
Guided Exercise
Eating Out for Students With Full Time Jobs vs. Part Time Jobs
Full time jobs: 5, 3, 4, 4, 4, 2, 1, 5, 6, 5, 6, 3, 3, 2, 4, 5, 2, 3, 7, 5, 5, 1, 4, 6, 7
Part time jobs: 1, 1, 5, 1, 4, 2, 2, 3, 3, 2, 3, 2, 4, 2, 1, 2, 3, 2, 1, 3, 3, 2, 4, 2, 1
Create a Dot Plot
Full time jobs: 5, 3, 4, 4, 4, 2, 1, 5, 6, 5, 6, 3, 3, 2, 4, 5, 2, 3, 7, 5, 5, 1, 4, 6, 7
Part time jobs: 1, 1, 5, 1, 4, 2, 2, 3, 3, 2, 3, 2, 4, 2, 1, 2, 3, 2, 1, 3, 3, 2, 4, 2, 1
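The two data sets above can be turned into rough text dot plots with a short script. This is a sketch for illustration: the `dot_plot` helper is a made-up name, and the column alignment assumes single-digit whole-number values, which holds for this data.

```python
from collections import Counter

# Times eaten out per week, copied from the exercise data.
full_time = [5, 3, 4, 4, 4, 2, 1, 5, 6, 5, 6, 3, 3, 2, 4, 5, 2, 3, 7, 5, 5, 1, 4, 6, 7]
part_time = [1, 1, 5, 1, 4, 2, 2, 3, 3, 2, 3, 2, 4, 2, 1, 2, 3, 2, 1, 3, 3, 2, 4, 2, 1]

def dot_plot(values):
    """Return a text dot plot: one 'o' per observation, stacked above its value."""
    counts = Counter(values)
    axis = range(min(values), max(values) + 1)
    rows = []
    # Build the plot from the tallest stack down to the axis.
    for level in range(max(counts.values()), 0, -1):
        rows.append(" ".join("o" if counts[x] >= level else " " for x in axis))
    rows.append(" ".join(str(x) for x in axis))
    return "\n".join(rows)

print("Full time jobs")
print(dot_plot(full_time))
print("Part time jobs")
print(dot_plot(part_time))
```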
Examine Shapes
Full time jobs: Relatively mound shaped.
Part time jobs: Slightly skewed right.
Examine Center
Full time jobs: Typically eat out 5 times per week.
Part time jobs: Typically eat out 2 times per week.
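One way to read "typical" here is the most frequent value, the tallest stack in each dot plot. That reading is an assumption about how the slide arrived at its numbers; the sketch below simply checks that it reproduces them. The `most_common_value` helper is a hypothetical name.

```python
from collections import Counter

# Times eaten out per week, copied from the exercise data.
full_time = [5, 3, 4, 4, 4, 2, 1, 5, 6, 5, 6, 3, 3, 2, 4, 5, 2, 3, 7, 5, 5, 1, 4, 6, 7]
part_time = [1, 1, 5, 1, 4, 2, 2, 3, 3, 2, 3, 2, 4, 2, 1, 2, 3, 2, 1, 3, 3, 2, 4, 2, 1]

def most_common_value(values):
    """Return the value that occurs most often (the tallest stack in a dot plot)."""
    return Counter(values).most_common(1)[0][0]

print("Full time:", most_common_value(full_time))
print("Part time:", most_common_value(part_time))
```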
Examine Variation
Full time jobs: Larger variation, from 1 to 7 times per week.
Part time jobs: Smaller variation, from 1 to 5 times per week.
Check for Outliers
Full time jobs: No gaps, so no clear outliers.
Part time jobs: No gaps, so no clear outliers.
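The spread and the gap check for this exercise can be confirmed with a short script. This is an informal sketch: `value_range` and `gaps` are hypothetical helper names, and looking for missing whole-number values is only a rough stand-in for spotting gaps by eye in a dot plot.

```python
# Times eaten out per week, copied from the exercise data.
full_time = [5, 3, 4, 4, 4, 2, 1, 5, 6, 5, 6, 3, 3, 2, 4, 5, 2, 3, 7, 5, 5, 1, 4, 6, 7]
part_time = [1, 1, 5, 1, 4, 2, 2, 3, 3, 2, 3, 2, 4, 2, 1, 2, 3, 2, 1, 3, 3, 2, 4, 2, 1]

def value_range(values):
    """Return the smallest and largest observed values."""
    return min(values), max(values)

def gaps(values):
    """Return whole numbers between the min and max that never occur in the data."""
    present = set(values)
    low, high = value_range(values)
    return [x for x in range(low, high + 1) if x not in present]

for label, data in (("Full time", full_time), ("Part time", part_time)):
    low, high = value_range(data)
    print(f"{label}: from {low} to {high} times per week, gaps: {gaps(data)}")
```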
Summarize
The typical part time worker eats out less often compared to the typical full time worker. There is wider variation for the eating out by full time workers than by part time workers. The shape of the distribution for full time workers is approximately mound shaped, while it is slightly skewed right for part time workers.