Page 1: Chapter 4 Displaying and Summarizing Quantitative Data

Chapter 4
Displaying and Summarizing Quantitative Data

CHAPTER OBJECTIVES

At the conclusion of this chapter you should be able to:

1) Construct graphs that appropriately describe quantitative data.
2) Calculate and interpret numerical summaries of quantitative data.
3) Combine numerical methods with graphical methods to analyze a data set.
4) Apply graphical methods of summarizing data to choose appropriate numerical summaries.
5) Apply software and/or calculators to automate graphical and numerical summary procedures.

Page 2: Chapter 4 Displaying and Summarizing Quantitative Data

Displaying Quantitative Data

Histograms

Stem and Leaf Displays

Page 3: Chapter 4 Displaying and Summarizing Quantitative Data

Relative Frequency Histogram of Exam Grades

[Figure: relative frequency histogram. Horizontal axis: Grade, 40 to 100 in steps of 10. Vertical axis: Relative frequency, 0 to .30 in steps of .05.]

Page 4: Chapter 4 Displaying and Summarizing Quantitative Data

Frequency Histogram

Page 5: Chapter 4 Displaying and Summarizing Quantitative Data

Histograms

A histogram shows three general types of information:

It provides visual indication of where the approximate center of the data is.

We can gain an understanding of the degree of spread, or variation, in the data.

We can observe the shape of the distribution.

Page 6: Chapter 4 Displaying and Summarizing Quantitative Data

All 200 m Races 20.2 secs or less

Page 7: Chapter 4 Displaying and Summarizing Quantitative Data

Histograms Showing Different Centers

Page 8: Chapter 4 Displaying and Summarizing Quantitative Data

Histograms Showing Different Centers (football head coach salaries)

Page 9: Chapter 4 Displaying and Summarizing Quantitative Data

Histograms - Same Center, Different Spread (football head coach salaries)

Page 10: Chapter 4 Displaying and Summarizing Quantitative Data

Excel Example: 2012-13 NFL Salaries

Page 11: Chapter 4 Displaying and Summarizing Quantitative Data

Statcrunch Example: 2012-13 NFL Salaries

Page 12: Chapter 4 Displaying and Summarizing Quantitative Data

Grades on a statistics exam

Data:

75 66 77 66 64 73 91 65 59 86 61 86 61

58 70 77 80 58 94 78 62 79 83 54 52 45

82 48 67 55

Page 13: Chapter 4 Displaying and Summarizing Quantitative Data

Frequency Distribution of Grades

Class Limits     Frequency
40 up to 50          2
50 up to 60          6
60 up to 70          8
70 up to 80          7
80 up to 90          5
90 up to 100         2
Total               30

Page 14: Chapter 4 Displaying and Summarizing Quantitative Data

Relative Frequency Distribution of Grades

Class Limits     Relative Frequency
40 up to 50      2/30 = .067
50 up to 60      6/30 = .200
60 up to 70      8/30 = .267
70 up to 80      7/30 = .233
80 up to 90      5/30 = .167
90 up to 100     2/30 = .067
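The two tables above can be reproduced with a few lines of code. The sketch below is only one possible way to do it: the function name `frequency_table` and its parameters are illustrative choices, not part of the slides, and it assumes classes of the form "lower up to upper" with the upper limit excluded.

```python
from collections import Counter

# Exam grades from the "Grades on a statistics exam" slide
grades = [75, 66, 77, 66, 64, 73, 91, 65, 59, 86, 61, 86, 61,
          58, 70, 77, 80, 58, 94, 78, 62, 79, 83, 54, 52, 45,
          82, 48, 67, 55]


def frequency_table(data, start, width, n_classes):
    """Return (lower, upper, frequency, relative frequency) for each class.

    Assumption: a value x belongs to the class 'lower up to upper'
    when lower <= x < upper.
    """
    counts = Counter((x - start) // width for x in data)
    table = []
    for k in range(n_classes):
        lower = start + k * width
        upper = lower + width
        freq = counts.get(k, 0)
        table.append((lower, upper, freq, freq / len(data)))
    return table


for lower, upper, freq, rel in frequency_table(grades, 40, 10, 6):
    print(f"{lower} up to {upper}: {freq:2d}  {rel:.3f}")
```

The relative frequencies are simply the class counts divided by n = 30, so they sum to 1 (up to rounding).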

Page 15: Chapter 4 Displaying and Summarizing Quantitative Data

Relative Frequency Histogram of Grades

[Figure: relative frequency histogram. Horizontal axis: Grade, 40 to 100 in steps of 10. Vertical axis: Relative frequency, 0 to .30 in steps of .05.]

Page 16: Chapter 4 Displaying and Summarizing Quantitative Data

Based on the histogram, about what percent of the values are between 47.5 and 52.5?

1. 50%
2. 5%
3. 17%
4. 30%

Page 17: Chapter 4 Displaying and Summarizing Quantitative Data

Stem and leaf displays have the following general appearance:

stem | leaf
1 | 8 9
2 | 1 2 8 9 9
3 | 2 3 8 9
4 | 0 1
5 | 6 7
6 | 4

Page 18: Chapter 4 Displaying and Summarizing Quantitative Data

Stem and Leaf Displays

Partition each number in the data into a “stem” and a “leaf”.

Constructing a stem and leaf display:

1) Determine the stem and leaf partition (5-20 stems).
2) Write the stems in a column with the smallest stem at the top; include all stems in the range of the data.
3) Use only 1 digit in the leaves; drop digits or round off.
4) Record the leaf for each number in the corresponding stem row; ordering the leaves in each row helps.

Page 19: Chapter 4 Displaying and Summarizing Quantitative Data

Example: employee ages at a small company

18 21 22 19 32 33 40 41 56 57 64 28 29 29 38 39

stem: 10’s digit; leaf: 1’s digit

18: stem = 1; leaf = 8; 18 = 1 | 8

stem | leaf
1 | 8 9
2 | 1 2 8 9 9
3 | 2 3 8 9
4 | 0 1
5 | 6 7
6 | 4
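The construction steps can be sketched in code. This is a minimal illustration, assuming two-digit non-negative whole numbers so that the stem is the 10’s digit and the leaf is the 1’s digit; the function name `stem_and_leaf` is a hypothetical choice.

```python
from collections import defaultdict

# Employee ages from the example
ages = [18, 21, 22, 19, 32, 33, 40, 41, 56, 57, 64, 28, 29, 29, 38, 39]


def stem_and_leaf(data):
    """Return the display as a list of text rows, smallest stem first."""
    rows = defaultdict(list)
    for x in sorted(data):            # sorting puts the leaves in ascending order
        rows[x // 10].append(x % 10)  # stem = 10's digit, leaf = 1's digit
    lines = []
    # Include every stem in the range of the data, even if it has no leaves
    for stem in range(min(rows), max(rows) + 1):
        leaves = " ".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines


print("\n".join(stem_and_leaf(ages)))
```

Calling `stem_and_leaf(ages + [95])` keeps the empty stems 7 and 8 in the display, which makes the gap before an unusually large value visible.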

Page 20: Chapter 4 Displaying and Summarizing Quantitative Data

Suppose a 95 yr. old is hired

stem | leaf
1 | 8 9
2 | 1 2 8 9 9
3 | 2 3 8 9
4 | 0 1
5 | 6 7
6 | 4
7 |
8 |
9 | 5

Page 21: Chapter 4 Displaying and Summarizing Quantitative Data

Number of TD passes by NFL teams: 2012-2013 season (stems are 10’s digit)

stem | leaf
4 | 03
3 | 247
2 | 6677789
2 | 01222233444
1 | 13467889
0 | 8

Page 22: Chapter 4 Displaying and Summarizing Quantitative Data

Pulse Rates n = 138

  #   Stem  Leaves
      4*
  3   4.    588
  9   5*    001233444
 10   5.    5556788899
 23   6*    00011111122233333344444
 23   6.    55556666667777788888888
 16   7*    00000112222334444
 23   7.    55555666666777888888999
 10   8*    0000112224
 10   8.    5555667789
  4   9*    0012
  2   9.    58
  4   10*   0223
      10.
  1   11*   1

Page 23: Chapter 4 Displaying and Summarizing Quantitative Data

Advantages/Disadvantages of Stem-and-Leaf Displays

Advantages

1) each measurement displayed
2) ascending order in each stem row
3) relatively simple (data set not too large)

Disadvantages

display becomes unwieldy for large data sets

Page 24: Chapter 4 Displaying and Summarizing Quantitative Data

Populations of 185 US cities with between 100,000 and 500,000 residents

Multiply stems by 100,000

Page 25: Chapter 4 Displaying and Summarizing Quantitative Data

Back-to-back stem-and-leaf displays

TD passes by NFL teams: 1999-2000, 2012-13 (multiply stems by 10)

  1999-2000 | stem | 2012-13
          2 |  4   | 03
          6 |  3   | 7
          2 |  3   | 24
       6655 |  2   | 6677789
43322221100 |  2   | 01222233444
 9998887666 |  1   | 67889
        421 |  1   | 134
            |  0   | 8

Page 26: Chapter 4 Displaying and Summarizing Quantitative Data

Below is a stem-and-leaf display for the pulse rates of 24 women at a health clinic. How many pulses are between 67 and 77?

Stems are 10’s digits

1. 4
2. 6
3. 8
4. 10
5. 12

Page 27: Chapter 4 Displaying and Summarizing Quantitative Data

Interpreting Graphical Displays: Shape

Symmetric distribution: A distribution is symmetric if the right and left sides of the histogram are approximately mirror images of each other.

Skewed distribution: A distribution is skewed to the right if the right side of the histogram (side with larger values) extends much farther out than the left side. It is skewed to the left if the left side of the histogram extends much farther out than the right side.

Complex, multimodal distribution: Not all distributions have a simple overall shape, especially when there are few observations.

Page 28: Chapter 4 Displaying and Summarizing Quantitative Data

Heights of Students in Recent Stats Class

Page 29: Chapter 4 Displaying and Summarizing Quantitative Data

Shape (cont.)

Female heart attack patients in New York state

Age: left-skewed; Cost: right-skewed

Page 30: Chapter 4 Displaying and Summarizing Quantitative Data

Shape (cont.): Outliers

An important kind of deviation is an outlier. Outliers are observations that lie outside the overall pattern of a distribution. Always look for outliers and try to explain them.

[Figure: histogram with Alaska and Florida labeled.]

The overall pattern is fairly symmetrical except for 2 states clearly not belonging to the main trend. Alaska and Florida have unusual representation of the elderly in their population.

A large gap in the distribution is typically a sign of an outlier.

Page 31: Chapter 4 Displaying and Summarizing Quantitative Data

Center: typical value of frozen personal pizza? ~$2.65

Page 32: Chapter 4 Displaying and Summarizing Quantitative Data

Spread: fuel efficiency 4, 8 cylinders

4 cylinders: more spread 8 cylinders: less spread

Page 33: Chapter 4 Displaying and Summarizing Quantitative Data

Other Graphical Methods for Economic Data

Time plots

Plot observations in time order, with time on the horizontal axis and the variable on the vertical axis.

Time series

Measurements are taken at regular intervals (monthly unemployment, quarterly GDP, weather records, electricity demand, etc.).

Page 34: Chapter 4 Displaying and Summarizing Quantitative Data

Unemployment Rate, by Educational Attainment

Page 35: Chapter 4 Displaying and Summarizing Quantitative Data

Water Use During Super Bowl

Page 36: Chapter 4 Displaying and Summarizing Quantitative Data

Winning Times 100 M Dash

Page 37: Chapter 4 Displaying and Summarizing Quantitative Data

Numerical Summaries of Quantitative Data

Numerical and More Graphical Methods to Describe Univariate Data

Page 38: Chapter 4 Displaying and Summarizing Quantitative Data

2 characteristics of a data set to measure

center

measures where the “middle” of the data is located

variability

measures how “spread out” the data is

Page 39: Chapter 4 Displaying and Summarizing Quantitative Data

The median: a measure of center

Given a set of n measurements arranged in order of magnitude,

\[
\text{Median} =
\begin{cases}
\text{middle value}, & n \text{ odd} \\
\text{mean of the 2 middle values}, & n \text{ even}
\end{cases}
\]

Ex. 2, 4, 6, 8, 10; n = 5; median = 6

Ex. 2, 4, 6, 8; n = 4; median = (4+6)/2 = 5
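The definition translates directly into code. The following is a minimal sketch; the function name `median` is our own, and Python's built-in `statistics.median` could be used instead.

```python
def median(values):
    """Median of a non-empty collection of numbers."""
    ordered = sorted(values)          # arrange in order of magnitude
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # n odd: the middle value
        return ordered[mid]
    # n even: mean of the 2 middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([2, 4, 6, 8, 10]))  # n = 5
print(median([2, 4, 6, 8]))      # n = 4
```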

Page 40: Chapter 4 Displaying and Summarizing Quantitative Data

Student Pulse Rates (n=62)

38, 59, 60, 60, 62, 62, 63, 63, 64, 64, 65, 67, 68, 70, 70, 70, 70, 70, 70, 70, 71, 71, 72, 72, 73, 74, 74, 75, 75, 75, 75, 76, 77, 77, 77, 77, 78, 78, 79, 79, 80, 80, 80, 84, 84, 85, 85, 87, 90, 90, 91, 92, 93, 94, 94, 95, 96, 96, 96, 98, 98, 103

Median = (75+76)/2 = 75.5

Page 41: Chapter 4 Displaying and Summarizing Quantitative Data

Medians are used often

Year 2011 baseball salaries

Median $1,450,000 (max=$32,000,000 Alex Rodriguez; min=$414,000)

Median fan age: MLB 45; NFL 43; NBA 41; NHL 39

Median existing home sales price: May 2011 $166,500; May 2010 $174,600

Median household income (2008 dollars) 2009 $50,221; 2008 $52,029

Page 42: Chapter 4 Displaying and Summarizing Quantitative Data

The median splits the histogram into 2 halves of equal area

Page 43: Chapter 4 Displaying and Summarizing Quantitative Data

Examples

Example: n = 7
17.5 2.8 3.2 13.9 14.1 25.3 45.8
Ordered: 2.8 3.2 13.9 14.1 17.5 25.3 45.8
m = 14.1

Example: n = 8
17.5 2.8 3.2 13.9 14.1 25.3 35.7 45.8
Ordered: 2.8 3.2 13.9 14.1 17.5 25.3 35.7 45.8
m = (14.1+17.5)/2 = 15.8

Page 44: Chapter 4 Displaying and Summarizing Quantitative Data

Below are the annual tuition charges at 7 public universities. What is the median tuition?

4429 4960 4960 4971 5245 5546 7586

1. 5245
2. 4965.5
3. 4960
4. 4971

Page 45: Chapter 4 Displaying and Summarizing Quantitative Data

Below are the annual tuition charges at 7 public universities. What is the median tuition?

4429 4960 5245 5546 4971 5587 7586

1. 5245
2. 4965.5
3. 5546
4. 4971

Page 46: Chapter 4 Displaying and Summarizing Quantitative Data

Measures of Spread

The range and interquartile range

Page 47: Chapter 4 Displaying and Summarizing Quantitative Data

Ways to measure variability

range = largest - smallest

OK sometimes; in general, too crude; sensitive to one large or small data value.

The range measures spread by examining the ends of the data.

A better way to measure spread is to examine the middle portion of the data.

Page 48: Chapter 4 Displaying and Summarizing Quantitative Data

Quartiles: Measuring spread by examining the middle

The first quartile, Q1, is the value in the sample that has 25% of the data at or below it (Q1 is the median of the lower half of the sorted data).

The third quartile, Q3, is the value in the sample that has 75% of the data at or below it (Q3 is the median of the upper half of the sorted data).

Position  Count  Value
 1        1      0.6
 2        2      1.2
 3        3      1.6
 4        4      1.9
 5        5      1.5
 6        6      2.1
 7        7      2.3    Q1 = first quartile = 2.3
 8        6      2.3
 9        5      2.5
10        4      2.8
11        3      2.9
12        2      3.3
13        1      3.4    m = median = 3.4
14        2      3.6
15        3      3.7
16        4      3.8
17        5      3.9
18        6      4.1
19        7      4.2    Q3 = third quartile = 4.2
20        6      4.5
21        5      4.7
22        4      4.9
23        3      5.3
24        2      5.6
25        1      6.1

Page 49: Chapter 4 Displaying and Summarizing Quantitative Data

Quartiles and median divide data into 4 pieces

[Figure: Q1, M, and Q3 divide the data into 4 pieces, each containing 1/4 of the data.]

Page 50: Chapter 4 Displaying and Summarizing Quantitative Data

Quartiles are common measures of spread

http://www2.acs.ncsu.edu/UPA/admissions/fresprof.htm

http://www2.acs.ncsu.edu/UPA/peers/current/ncsu_peers/sat.htm

University of Southern California

Page 51: Chapter 4 Displaying and Summarizing Quantitative Data

Rules for Calculating Quartiles

Step 1: find the median of all the data (the median divides the data in half).

Step 2a: find the median of the lower half; this median is Q1.
Step 2b: find the median of the upper half; this median is Q3.

Important: when n is odd, include the overall median in both halves; when n is even, do not include the overall median in either half.
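The rule above can be written as a short function. This is a sketch of this chapter's hand rule only (other textbooks and software use different quartile conventions); the names `median` and `quartiles` are illustrative, and the data values are taken from the examples in this chapter.

```python
def median(values):
    """Middle value (n odd) or mean of the 2 middle values (n even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """Return (Q1, median, Q3) using the rule from the slide."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    if n % 2 == 1:
        # n odd: the overall median belongs to both halves
        lower = ordered[:half + 1]
        upper = ordered[half:]
    else:
        # n even: the data split cleanly into two halves
        lower = ordered[:half]
        upper = ordered[half:]
    return median(lower), median(ordered), median(upper)


# n even: ten values
print(quartiles([2, 4, 6, 8, 10, 12, 14, 16, 18, 20]))
# n odd: Babe Ruth's home runs in each season as a Yankee
print(quartiles([54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]))
```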

Page 52: Chapter 4 Displaying and Summarizing Quantitative Data

Example: 2 4 6 8 10 12 14 16 18 20; n = 10

Median m = (10+12)/2 = 22/2 = 11

Q1: median of lower half 2 4 6 8 10; Q1 = 6

Q3: median of upper half 12 14 16 18 20; Q3 = 16

Page 53: Chapter 4 Displaying and Summarizing Quantitative Data

Quartile example: odd number of data values

HR’s hit by Babe Ruth in each season as a Yankee: 54 59 35 41 46 25 47 60 54 46 49 46 41 34 22

Ordered values: 22 25 34 35 41 41 46 46 46 47 49 54 54 59 60

Median: value in ordered position 8; median = 46

Lower half (including overall median): 22 25 34 35 41 41 46 46

Upper half (including overall median): 46 46 47 49 54 54 59 60

\[ \text{lower quartile } Q_1 = \frac{35 + 41}{2} = 38 \]

\[ \text{upper quartile } Q_3 = \frac{49 + 54}{2} = 51.5 \]

Page 54: Chapter 4 Displaying and Summarizing Quantitative Data

Pulse Rates n = 138

  #   Stem  Leaves
      4*
  3   4.    588
  9   5*    001233444
 10   5.    5556788899
 23   6*    00011111122233333344444
 23   6.    55556666667777788888888
 16   7*    00000112222334444
 23   7.    55555666666777888888999
 10   8*    0000112224
 10   8.    5555667789
  4   9*    0012
  2   9.    58
  4   10*   0223
      10.
  1   11*   1

Median: mean of pulses in locations 69 & 70: median = (70+70)/2 = 70

Q1: median of lower half (lower half = 69 smallest pulses); Q1 = pulse in ordered position 35; Q1 = 63

Q3: median of upper half (upper half = 69 largest pulses); Q3 = pulse in position 35 from the high end; Q3 = 78

Page 55: Chapter 4 Displaying and Summarizing Quantitative Data

Below are the weights of 31 linemen on the NCSU football team. What is the value of the first quartile Q1?

  #   stem | leaf
  2    22  | 55
  4    23  | 57
  6    24  | 26
  7    25  | 7
 10    26  | 257
 12    27  | 59
 (4)   28  | 1567
 15    29  | 35599
 10    30  | 333
  7    31  | 45
  5    32  | 155
  2    33  | 6
  1    34  | 0

1. 287
2. 257.5
3. 263.5
4. 262.5

Page 56: Chapter 4 Displaying and Summarizing Quantitative Data

Interquartile range

lower quartile Q1

middle quartile: median

upper quartile Q3

interquartile range (IQR): IQR = Q3 - Q1

measures spread of middle 50% of the data

Page 57: Chapter 4 Displaying and Summarizing Quantitative Data

Example: beginning pulse rates

Q3 = 78; Q1 = 63

IQR = 78 – 63 = 15

Page 58: Chapter 4 Displaying and Summarizing Quantitative Data

Below are the weights of 31 linemen on the NCSU football team. The first quartile Q1 is 263.5. What is the value of the IQR?

  #   stem | leaf
  2    22  | 55
  4    23  | 57
  6    24  | 26
  7    25  | 7
 10    26  | 257
 12    27  | 59
 (4)   28  | 1567
 15    29  | 35599
 10    30  | 333
  7    31  | 45
  5    32  | 155
  2    33  | 6
  1    34  | 0

1. 23.5
2. 39.5
3. 46
4. 69.5

Page 59: Chapter 4 Displaying and Summarizing Quantitative Data

5-number summary of data

Minimum Q1 median Q3 maximum

Pulse data

45 63 70 78 111
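Software can produce the 5-number summary and the IQR directly. The sketch below uses Python's standard `statistics.quantiles` with `method="inclusive"` on the Babe Ruth home-run data from the quartile example (the raw pulse data are not listed here). Note the assumption: software packages use several different quartile conventions, so results are not guaranteed to match the hand rule on every data set. The function name `five_number_summary` is an illustrative choice.

```python
from statistics import quantiles

# Home runs hit by Babe Ruth in each season as a Yankee
home_runs = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum).

    Assumption: method="inclusive" is one of several quartile conventions.
    """
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    return min(data), q1, med, q3, max(data)


minimum, q1, med, q3, maximum = five_number_summary(home_runs)
iqr = q3 - q1  # spread of the middle 50% of the data
print(minimum, q1, med, q3, maximum, iqr)
```

For this data set the inclusive method agrees with the hand calculation (Q1 = 38, Q3 = 51.5), while the default `method="exclusive"` gives Q1 = 35 and Q3 = 54. It is worth checking which convention a calculator or package uses before comparing answers.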

Page 60: Chapter 4 Displaying and Summarizing Quantitative Data

End of General Numerical Summaries

Next: Numerical Summaries of Symmetric Data

Page 61: Chapter 4 Displaying and Summarizing Quantitative Data

Numerical Summaries of Symmetric Data.

Measure of Center: Mean

Measure of Variability: Standard Deviation

Page 62: Chapter 4 Displaying and Summarizing Quantitative Data

Symmetric Data

Body temp. of 93 adults

Page 63: Chapter 4 Displaying and Summarizing Quantitative Data

Recall: 2 characteristics of a data set to measure

center

measures where the “middle” of the data is located

variability

measures how “spread out” the data is

Page 64: Chapter 4 Displaying and Summarizing Quantitative Data

Measure of Center When Data Approx. Symmetric

mean (arithmetic mean)

notation:

\( x_i \): the \( i \)th measurement in a set of observations \( x_1, x_2, x_3, \ldots, x_n \)

\( n \): number of measurements in data set; sample size

Page 65: Chapter 4 Displaying and Summarizing Quantitative Data

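Sample mean:

\[ \bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n} \]

Population mean (value typically not known):

\[ \mu = \frac{\sum_{i=1}^{N} x_i}{N}, \qquad N = \text{population size} \]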

Page 66: Chapter 4 Displaying and Summarizing Quantitative Data

Connection Between Mean and Histogram

A histogram balances when supported at the mean. Mean \( \bar{x} = 140.6 \)

[Figure: frequency histogram titled "Absences from Work". Horizontal axis: 118.5 to 160.5 in steps of 7, then "More". Vertical axis: Frequency, 0 to 70.]

Page 67: Chapter 4 Displaying and Summarizing Quantitative Data

Mean: balance point

Median: 50% area each half

Right histogram: mean 55.26 yrs, median 57.7 yrs

Page 68: Chapter 4 Displaying and Summarizing Quantitative Data

Properties of Mean, Median

1. The mean and median are unique; that is, a data set has only 1 mean and 1 median (the mean and median are not necessarily equal).

2. The mean uses the value of every number in the data set; the median does not.

Ex. 2, 4, 6, 8: \( \bar{x} = \frac{20}{4} = 5 \); \( m = \frac{4+6}{2} = 5 \)

Ex. 2, 4, 6, 9: \( \bar{x} = \frac{21}{4} = 5\tfrac{1}{4} \); \( m = \frac{4+6}{2} = 5 \)

Page 69: Chapter 4 Displaying and Summarizing Quantitative Data

Example: class pulse rates

53 64 67 67 70 76 77 77 78 83 84 85 85 89 90 90 90 90 91 96 98 103 140

\[ \bar{x} = \frac{\sum_{i=1}^{23} x_i}{23} = 84.48 \]

\( m \): location: 12th obs.; \( m = 85 \)
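These values are easy to check with software. The sketch below uses Python's standard `statistics` module as one possible tool; a calculator or spreadsheet would serve equally well.

```python
from statistics import mean, median

# Class pulse rates from the example (n = 23)
pulses = [53, 64, 67, 67, 70, 76, 77, 77, 78, 83, 84, 85, 85, 89, 90,
          90, 90, 90, 91, 96, 98, 103, 140]

print(len(pulses))             # sample size n
print(round(mean(pulses), 2))  # sample mean, rounded to 2 decimals
print(median(pulses))          # 12th ordered observation

# The mean uses every value; the median does not:
# changing the largest value from 8 to 9 moves the mean but not the median.
print(mean([2, 4, 6, 8]), median([2, 4, 6, 8]))
print(mean([2, 4, 6, 9]), median([2, 4, 6, 9]))
```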

Page 70: Chapter 4 Displaying and Summarizing Quantitative Data

2010, 2014 baseball salaries

2010

n = 845

mean \( \bar{x} \) = $3,297,828

median = $1,330,000

max = $33,000,000

2014

n = 848

mean \( \bar{x} \) = $3,932,912

median = $1,456,250

max = $28,000,000

Page 71: Chapter 4 Displaying and Summarizing Quantitative Data

Disadvantage of the mean

Can be greatly influenced by just a few observations that are much greater or much smaller than the rest of the data

Page 72: Chapter 4 Displaying and Summarizing Quantitative Data

Mean, Median, Maximum Baseball Salaries 1985 - 2014

Page 73: Chapter 4 Displaying and Summarizing Quantitative Data

Skewness: comparing the mean and median

Skewed to the right (positively skewed): mean > median

[Figure: frequency histogram titled "2011 Baseball Salaries". Horizontal axis: Salary ($1,000's). Vertical axis: Frequency, 0 to 600. Bar labels, left to right: 53, 490, 102, 72, 35, 21, 26, 17, 8, 10, 2, 3, 1, 0, 0, 1.]

Page 74: Chapter 4 Displaying and Summarizing Quantitative Data

Skewed to the left (negatively skewed): mean < median

mean = 78; median = 87

[Figure: "Histogram of Exam Scores". Horizontal axis: Exam Scores, 20 to 100. Vertical axis: Frequency, 0 to 30.]

Page 75: Chapter 4 Displaying and Summarizing Quantitative Data

Symmetric data

mean, median approx. equal

[Figure: frequency histogram titled "Bank Customers: 10:00-11:00 am". Horizontal axis: Number of Customers. Vertical axis: Frequency, 0 to 20.]

Page 76: Chapter 4 Displaying and Summarizing Quantitative Data

DESCRIBING VARIABILITY OF SYMMETRIC DATA

Page 77: Chapter 4 Displaying and Summarizing Quantitative Data

Describing Symmetric Data (cont.)

Measure of center for symmetric data:

\[ \text{Sample mean } \bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n} \]

Measure of variability for symmetric data?

Page 78: Chapter 4 Displaying and Summarizing Quantitative Data

Example

2 data sets:

\( x_1 = 49,\ x_2 = 51;\ \bar{x} = 50 \)

\( y_1 = 0,\ y_2 = 100;\ \bar{y} = 50 \)

Page 79: Chapter 4 Displaying and Summarizing Quantitative Data

On average, they’re both comfortable

[Figure: number line marking 0, 49, 51, and 100.]

Page 80: Chapter 4 Displaying and Summarizing Quantitative Data

Ways to measure variability

1. range = largest - smallest

OK sometimes; in general, too crude; sensitive to one large or small observation.

2. Measure spread from the middle, where the middle is the mean \( \bar{x} \):

deviation of \( x_i \) from the mean: \( (x_i - \bar{x}) \); sum the deviations of all the \( x_i \)'s from \( \bar{x} \):

\[ \sum_{i=1}^{n} (x_i - \bar{x}) = 0 \text{ always; tells us nothing} \]

Page 81: Chapter 4 Displaying and Summarizing Quantitative Data

Previous Example

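Sum of deviations from mean:

\( x_1 = 49,\ x_2 = 51;\ \bar{x} = 50 \)

\[ (x_1 - \bar{x}) + (x_2 - \bar{x}) = (49 - 50) + (51 - 50) = -1 + 1 = 0 \]

\( y_1 = 0,\ y_2 = 100;\ \bar{y} = 50 \)

\[ (y_1 - \bar{y}) + (y_2 - \bar{y}) = (0 - 50) + (100 - 50) = -50 + 50 = 0 \]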
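The deviations cancel in both data sets, and this is no coincidence. A short derivation, using only the definition of the sample mean \( \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \), shows why the sum is zero for any data set:

```latex
\[
\sum_{i=1}^{n} (x_i - \bar{x})
  = \sum_{i=1}^{n} x_i - n\bar{x}
  = n\bar{x} - n\bar{x}
  = 0
\]
```

The middle step uses \( \sum_{i=1}^{n} x_i = n\bar{x} \), which is the definition of the mean rearranged. Because positive and negative deviations always offset each other, a useful measure of spread has to remove the signs, for example by squaring each deviation.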

Page 82: Chapter 4 Displaying and Summarizing Quantitative Data

The Sample Standard Deviation, a measure of spread around the mean

Square the deviation of each observation from the mean; find the square root of the “average” of these squared deviations:

\( (x_i - \bar{x}) \); \( (x_i - \bar{x})^2 \); find the “average” and then take the square root of the average:

\[ s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} \quad \text{called the sample standard deviation} \]

Page 83: Chapter 4 Displaying and Summarizing Quantitative Data

Calculations …

Women height (inches)

Mean = 63.4

Sum of squared deviations from mean = 85.2

(n − 1) = 13; (n − 1) is called degrees of freedom (df)

\( s^2 \) = variance = 85.2/13 = 6.55 inches squared

\( s \) = standard deviation = √6.55 = 2.56 inches

Page 84: Chapter 4 Displaying and Summarizing Quantitative Data

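\[ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \]

1. First calculate the variance \( s^2 \).
2. Then take the square root to get the standard deviation \( s \).

\[ s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2} \]

[Figure label: Mean ± 1 s.d.]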

We’ll never calculate these by hand, so make sure to know how to get the standard deviation using your calculator, Excel, or other software.

Page 85: Chapter 4 Displaying and Summarizing Quantitative Data

Population Standard Deviation

\[ \sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}} \quad \text{population standard deviation} \]

value of \( \sigma \) typically not known; use \( s \) to estimate value of \( \sigma \)

Page 86: Chapter 4 Displaying and Summarizing Quantitative Data

Remarks

1. The standard deviation of a set of measurements is an estimate of the likely size of the chance error in a single measurement

Page 87: Chapter 4 Displaying and Summarizing Quantitative Data

Remarks (cont.)

2. Note that \( s \) and \( \sigma \) are always greater than or equal to zero.

3. The larger the value of \( s \) (or \( \sigma \)), the greater the spread of the data.

When does \( s = 0 \)? When does \( \sigma = 0 \)? When all data values are the same.

Page 88: Chapter 4 Displaying and Summarizing Quantitative Data

Remarks (cont.)

4. The standard deviation is the most commonly used measure of risk in finance and business (stocks, mutual funds, etc.).

5. Variance: \( s^2 \) sample variance; \( \sigma^2 \) population variance. Units are squared units of the original data: square $, square gallons ??

Page 89: Chapter 4 Displaying and Summarizing Quantitative Data

Remarks 6): Why divide by n-1 instead of n?

degrees of freedom: each observation has 1 degree of freedom; however, when you estimate an unknown population parameter like \( \mu \), you lose 1 degree of freedom.

In the formula for \( s \), we use \( \bar{x} \) to estimate the unknown value of \( \mu \):

\[ s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} \]

Page 90: Chapter 4 Displaying and Summarizing Quantitative Data

Remarks 6) (cont.): Why divide by n-1 instead of n? Example

Suppose we have 3 numbers whose average is 9.

Choose ANY values for x1 and x2:

x1 =          x2 =

then x3 must be

Since the average (mean) is 9, x1 + x2 + x3 must equal 9*3 = 27, so x3 = 27 - (x1 + x2).

Once we selected x1 and x2, x3 was determined since the average was 9.

3 numbers but only 2 “degrees of freedom”

Page 91: Chapter 4 Displaying and Summarizing Quantitative Data

Computational Example

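observations: 1, 3, 5, 9; \( n = 4 \); \( \bar{x} = \frac{18}{4} = 4.5 \)

\[ s^2 = \frac{(1-4.5)^2 + (3-4.5)^2 + (5-4.5)^2 + (9-4.5)^2}{4-1} \]

\[ = \frac{(-3.5)^2 + (-1.5)^2 + (.5)^2 + (4.5)^2}{3} = \frac{12.25 + 2.25 + .25 + 20.25}{3} = \frac{35}{3} = 11.67 \]

\[ s = \sqrt{11.67} = 3.42 \]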
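The same calculation can be carried out step by step in code and then cross-checked against a library routine. This is a sketch using Python's standard `math` and `statistics` modules; the variable names are illustrative.

```python
from math import sqrt
from statistics import stdev

observations = [1, 3, 5, 9]

n = len(observations)
x_bar = sum(observations) / n                           # sample mean
squared_devs = [(x - x_bar) ** 2 for x in observations]  # squared deviations
s2 = sum(squared_devs) / (n - 1)                        # variance: divide by n - 1
s = sqrt(s2)                                            # standard deviation

print(x_bar, squared_devs, round(s2, 2), round(s, 2))

# Cross-check: statistics.stdev also divides by n - 1 (sample standard deviation)
print(round(stdev(observations), 2))
```

Dividing by `n - 1` gives the sample standard deviation; `statistics.pstdev` divides by `N` and corresponds to the population formula.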

Page 92: Chapter 4 Displaying and Summarizing Quantitative Data

class pulse rates

53 64 67 67 70 76 77 77 78 83 84 85 85 89 90 90 90 90 91 96 98 103 140

\( n = 23 \); \( \bar{x} = 84.48 \); \( m = 85 \)

\( s^2 = 290.26 \) (beats per minute)\(^2\)

\( s = 17.037 \) beats per minute

Page 93: Chapter 4 Displaying and Summarizing Quantitative Data

Review: Properties of \( s \) and \( \sigma \)

\( s \) and \( \sigma \) are always greater than or equal to 0; when does \( s = 0 \)? \( \sigma = 0 \)?

The larger the value of \( s \) (or \( \sigma \)), the greater the spread of the data.

The standard deviation of a set of measurements is an estimate of the likely size of the chance error in a single measurement.

Page 94: Chapter 4 Displaying and Summarizing Quantitative Data

Summary of Notation

SAMPLE

\( \bar{y} \): sample mean
\( m \): sample median
\( s^2 \): sample variance
\( s \): sample stand. dev.

POPULATION

\( \mu \): population mean
\( m \): population median
\( \sigma^2 \): population variance
\( \sigma \): population stand. dev.

Page 95: Chapter 4 Displaying and Summarizing Quantitative Data

End of Chapter 4