Exploring Data - Moore Public Schools

Exploring Data

Chapter 1

Patterns from Histogram A

Center: the value that divides the observations roughly in half

Spread (variability): the extent of the data from smallest to largest value
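As a minimal sketch of these two definitions, the Python below takes the center to be the median and reports the spread as the smallest and largest values. The data list is invented for illustration and is not the data behind Histogram A.

```python
from statistics import median

def center_and_spread(values):
    """Return (center, smallest, largest) for a list of numbers."""
    # The median splits the sorted observations roughly in half.
    return median(values), min(values), max(values)

# Hypothetical observations, made up for this example only.
scores = [25, 30, 32, 35, 35, 38, 41, 45]
center, low, high = center_and_spread(scores)
print(f"center: {center}, spread: {low} to {high}")
```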

Histogram A example

center: 35, spread: 25 to 45

[Histogram; horizontal axis from 0 to 100 in steps of 10]

Histogram A practice

center, spread

[Histogram; horizontal axis from 0 to 100 in steps of 10]

Patterns from Histogram B

Center: the value that divides the observations roughly in half

Spread (variability): the extent of the data from smallest to largest value

Shape: overall appearance of distribution
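One rough numerical check on shape is to compare the mean with the median: a long right tail tends to pull the mean above the median, and a long left tail pulls it below. The sketch below applies that rule of thumb. The function name, the `tolerance` cutoff, and the three data lists are all hypothetical, and the check cannot tell uniform or bimodal shapes from a mound, so it supplements looking at the histogram rather than replacing it.

```python
from statistics import mean, median

def describe_skew(values, tolerance=0.05):
    """Rough skew label from comparing mean and median.

    tolerance is an arbitrary cutoff (a fraction of the range),
    chosen here for illustration only.
    """
    spread = max(values) - min(values)
    if spread == 0:
        return "no spread"
    diff = (mean(values) - median(values)) / spread
    if diff > tolerance:
        return "skewed right"
    if diff < -tolerance:
        return "skewed left"
    return "roughly symmetrical"

# Hypothetical data sets, invented for this example.
right_tail = [10, 12, 13, 15, 15, 18, 20, 45, 70]
left_tail = [30, 55, 80, 82, 85, 85, 87, 88, 90]
mound = [40, 45, 50, 50, 55, 55, 55, 60, 60, 65, 70]

for name, data in [("right_tail", right_tail), ("left_tail", left_tail), ("mound", mound)]:
    print(name, "->", describe_skew(data))
```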

Histogram B example

skewed right

[Histogram; horizontal axis from 0 to 100 in steps of 10]

Histogram B example

skewed left

[Histogram; horizontal axis from 0 to 100 in steps of 10]

Histogram B example

symmetrical, mound shaped

[Histogram; horizontal axis from 0 to 100 in steps of 10]

Histogram B example

uniform, spread from 55 to 80, center around 70

[Histogram; horizontal axis from 0 to 100 in steps of 10]

Histogram B example

bimodal

[Histogram; horizontal axis from 0 to 100 in steps of 10]

Patterns from Histogram C

Center: the value that divides the observations roughly in half

Spread (variability): the extent of the data from smallest to largest value

Shape: overall appearance of distribution

Unusual features: gaps/clusters and outliers
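Gaps can also be found by counting. The sketch below bins the data the way a histogram with bars of width 10 would, then reports empty bins that sit between occupied ones. The function names and the data are hypothetical, and values are assumed to lie between `start` and `stop`. A value cut off from the rest by a run of empty bins is a candidate outlier, although calling it one remains a judgment.

```python
def bin_counts(values, width=10, start=0, stop=100):
    """Count observations in bins [start, start+width), ... up to stop."""
    n_bins = (stop - start) // width
    counts = [0] * n_bins
    for v in values:
        # A value equal to stop is placed in the last bin.
        index = min(int((v - start) // width), n_bins - 1)
        counts[index] += 1
    return counts

def find_gaps(counts, width=10, start=0):
    """Return the left edges of empty bins lying between occupied bins."""
    occupied = [i for i, c in enumerate(counts) if c > 0]
    if not occupied:
        return []
    first, last = occupied[0], occupied[-1]
    return [start + i * width for i in range(first, last + 1) if counts[i] == 0]

# Hypothetical data: one low value far from the main cluster.
data = [5, 42, 44, 47, 51, 53, 55]
counts = bin_counts(data)
print("counts per bin:", counts)
print("gaps start at:", find_gaps(counts))
```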

Histogram C example

roughly symmetrical with gaps at 30 and 40, center at 35, spread from 20 to 55

[Histogram; horizontal axis from 0 to 100 in steps of 10]

Histogram C example

uniform with possible outlier at 5, center around 43, spread from 5 to 55

[Histogram; horizontal axis from 0 to 100 in steps of 10]

Displaying Distributions with Graphs

categorical versus quantitative

categorical: bar graphs, pie charts

quantitative: dotplots, histograms, stemplots, boxplots

Typing Speeds Stemplot

[Stemplot of typing speeds, stems 2 to 9]

Fairly symmetrical

Median: 62

Spread: from 22 to 91

No unusual features

Key: 2|2 means 22 wpm
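A stemplot like this one can be built by sorting the data, taking the tens digit as the stem and the ones digit as the leaf. The sketch below assumes non-negative whole numbers. The function name and the speeds are hypothetical and are not the values on the slide. Stems with no leaves are still printed so that gaps stay visible.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot lines: tens digit(s) as stem, ones digit as leaf."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    lines = []
    # Walk every stem from smallest to largest so empty stems show up as gaps.
    for stem in range(min(leaves), max(leaves) + 1):
        row = " ".join(str(leaf) for leaf in leaves.get(stem, []))
        lines.append(f"{stem} | {row}".rstrip())
    return lines

# Hypothetical typing speeds in words per minute, invented for illustration.
speeds = [22, 35, 41, 48, 53, 57, 62, 62, 68, 74, 79, 85, 91]
lines = stemplot(speeds)
print("\n".join(lines))
print("Key: 2|2 means 22 wpm")
```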

Alfred Hitchcock Stemplot

[Stemplot of running times in minutes, stems 8 to 13]

Key: 8|1 means 81 minutes

Slightly skewed

Median: 116

Spread: from 81 to 136

Gap in 90s

Split Stemplot

[Stemplot of ages, stems 1 to 4]

As with a histogram, we want to avoid having too many data points in a small range.

Ages at which a sample of 35 American mothers first gave birth

Key: 1|4 means 14 years old

Split Stemplot

Key: 1|4 means 14 years old

A split stemplot typically breaks each stem into Low (leaves 0-4) and High (leaves 5-9).
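The split can be sketched in Python by grouping values in blocks of five, so that leaves 0-4 land on the L half of a stem and leaves 5-9 on the H half. The function name and the ages are hypothetical and are not the sample of 35 mothers. Non-negative whole numbers are assumed.

```python
from collections import defaultdict

def split_stemplot(values):
    """Stemplot lines with each stem split into L (0-4) and H (5-9)."""
    rows = defaultdict(list)
    for v in sorted(values):
        # v // 5 numbers the half-stems: 14 -> 2 (1L), 16 -> 3 (1H), 20 -> 4 (2L).
        rows[v // 5].append(v % 10)
    lines = []
    for key in range(min(rows), max(rows) + 1):
        label = f"{key // 2}{'LH'[key % 2]}"
        row = " ".join(str(leaf) for leaf in rows.get(key, []))
        lines.append(f"{label} | {row}".rstrip())
    return lines

# Hypothetical ages, invented for illustration.
ages = [14, 16, 18, 20, 21, 23, 24, 26, 27, 29, 30, 33]
lines = split_stemplot(ages)
print("\n".join(lines))
print("Key: 1|4 means 14 years old")
```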

[Split stemplot of the same ages, stems 1L to 4L]

Back to Back Stemplots

         Ruth |   | Maris
            0 | 6 | 1
        9 4 4 | 5 |
9 7 6 6 6 1 1 | 4 |
          5 4 | 3 | 3 9
          5 2 | 2 | 3 6 8
              | 1 | 3 4 6
              | 0 | 8

Key: 4 | 1 means 41
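A back-to-back stemplot shares one column of stems between two data sets, with one set's leaves running outward to the left and the other's to the right. The sketch below is one way to print such a plot. The function name is hypothetical, and the two lists are season home-run counts as best read from the stems and leaves of this plot, so treat the exact values, and which player sits on which side, as assumptions.

```python
from collections import defaultdict

def back_to_back(left, right):
    """Lines of a back-to-back stemplot; left leaves read outward from the stem."""
    l_rows, r_rows = defaultdict(list), defaultdict(list)
    for v in sorted(left):
        l_rows[v // 10].append(str(v % 10))
    for v in sorted(right):
        r_rows[v // 10].append(str(v % 10))
    low = min(min(l_rows), min(r_rows))
    high = max(max(l_rows), max(r_rows))
    # Pad the left side to the width of its longest row so the stems line up.
    width = max(len(" ".join(leaves)) for leaves in l_rows.values())
    lines = []
    for stem in range(high, low - 1, -1):  # largest stem on top
        l = " ".join(reversed(l_rows.get(stem, [])))
        r = " ".join(r_rows.get(stem, []))
        lines.append(f"{l:>{width}} | {stem} | {r}".rstrip())
    return lines

# Assumed data, read from the plot's leaves (see the note above).
ruth = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]
maris = [8, 13, 14, 16, 23, 26, 28, 33, 39, 61]
lines = back_to_back(ruth, maris)
print("\n".join(lines))
print("Key: 4 | 1 means 41")
```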

Babe Ruth vs. Roger Maris

Babe Ruth generally hit more home runs than Roger Maris.

Ruth's center, at 46 home runs, is higher than Maris's, at 24.5.

Maris has a possible outlier at 61, while Ruth has no outliers.

Maris has the larger spread, from 8 to 61, while Ruth's values run higher, from 22 to 60. The difference in level is especially clear if we exclude Maris's possible outlier.

Both distributions are fairly symmetrical.
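The numbers in this comparison can be checked with a short script. The two lists are the home-run counts as best read from the back-to-back stemplot, so the individual values are an assumption. The helper name is hypothetical.

```python
from statistics import median

def summarize(values):
    """Return the median, smallest value, largest value, and range."""
    low, high = min(values), max(values)
    return {"median": median(values), "min": low, "max": high, "range": high - low}

# Assumed data, read from the stemplot's leaves.
ruth = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]
maris = [8, 13, 14, 16, 23, 26, 28, 33, 39, 61]

print("Ruth: ", summarize(ruth))
print("Maris:", summarize(maris))

# Setting aside the possible outlier at 61 shows how much it stretches the spread.
maris_without_61 = [v for v in maris if v != 61]
print("Maris without 61:", summarize(maris_without_61))
```

With these values the medians come out at 46 and 24.5, matching the centers quoted above.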
