Describing Distributions with Numbers (Section 1.2)

1.2 Power Point

8/7/2019 1.2 Power Point

http://slidepdf.com/reader/full/12-power-point 1/21

Describing Distributions with Numbers

Section 1.2

What Does That Mean?

We are about to learn specific ways to calculate the center and spread of a distribution. You can calculate these numerical values for any quantitative variable. But to interpret these measures of center and spread, and to choose among the several methods you will learn, you must think about the shape of the distribution and the meaning of the data. The numbers, like the graphs, are aids to understanding, not "the answer" in themselves.

Measures of Center

Mean:
of a sample: $\bar{x} = \frac{\sum x_i}{n}$
of a population: $\mu = \frac{\sum x_i}{n}$

Median: the value that divides the data into equal halves (it may or may not be a value in the data set)

The mean is the balance point of the distribution.

The median divides the distribution into two equal areas.

Practice:

1. Find the mean and median for each list and contrast their behavior:
   1. 1, 2, 6
   2. 1, 2, 9
   3. 1, 2, 297
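The contrast this exercise is after can be sketched in a few lines of Python. The choice of Python and its standard `statistics` module is an assumption made here for illustration; the slides themselves work by hand or on a graphing calculator.

```python
from statistics import mean, median

# The three lists from the practice problem.
lists = [[1, 2, 6], [1, 2, 9], [1, 2, 297]]

# Compute (mean, median) for each list.
summaries = [(mean(xs), median(xs)) for xs in lists]

for xs, (m, med) in zip(lists, summaries):
    print(xs, "mean =", m, "median =", med)
```

The median stays at 2 in all three lists, while the mean moves from 3 to 4 to 100 as the largest value grows.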

Measures of Spread

Range: maximum - minimum

Percentiles: the pth percentile of a distribution is the value such that p percent of the observations fall at or below it (the median is the 50th percentile)

Quartiles: the lower quartile (Q1) is the 25th percentile (or the median of the lower half) and the upper quartile (Q3) is the 75th percentile (or the median of the upper half)

Interquartile Range (IQR) = Q3 - Q1
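A minimal sketch of these measures, assuming the "median of the lower half / upper half" rule for quartiles and taking the IQR as Q3 minus Q1. The helper name `quartiles` and the data set are hypothetical, chosen only for illustration. Software packages often use other percentile interpolation rules, so their quartiles can differ slightly from this one.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, median, Q3) using the median-of-halves rule."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]            # values below the median position
    upper = xs[half + n % 2:]    # values above it (middle value skipped when n is odd)
    return median(lower), median(xs), median(upper)

# Hypothetical data, not taken from the slides.
data = [2, 4, 4, 5, 7, 8, 10, 12]

data_range = max(data) - min(data)
q1, med, q3 = quartiles(data)
iqr = q3 - q1

print("range =", data_range)
print("Q1 =", q1, "median =", med, "Q3 =", q3)
print("IQR =", iqr)
```

For this data the lower half is 2, 4, 4, 5 and the upper half is 7, 8, 10, 12, so Q1 = 4, Q3 = 9, and the IQR is 5.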

Another Measure of Spread

Standard Deviation (Std Dev):
of a sample: $s_x = \sqrt{\frac{1}{n-1} \sum (x_i - \bar{x})^2}$
of a population: $\sigma_x = \sqrt{\frac{1}{n} \sum (x_i - \mu)^2}$

More About Standard Deviation

The differences of each value from the mean are deviations: $x_i - \bar{x}$

Since the mean is the balance point of the distribution, the set of all deviations from the mean will always add to zero: $\sum (x_i - \bar{x}) = 0$

The Variance is: $s_x^2 = \frac{1}{n-1} \sum (x_i - \bar{x})^2$

The Standard Deviation is: $s_x = \sqrt{\frac{1}{n-1} \sum (x_i - \bar{x})^2}$

Practice:

2. For the sample: 1, 2, 4, 6, 9
   a. Verify that the sum of the deviations from the mean is 0.
   b. Find the standard deviation by hand.
   c. Find the standard deviation on the graphing calculator.
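One way to check the hand computation is a short Python sketch. Python's `statistics.stdev` stands in for the graphing calculator here, which is an assumption made for illustration; like the sample formula, it divides by n - 1.

```python
from math import sqrt, isclose
from statistics import mean, stdev

sample = [1, 2, 4, 6, 9]  # the sample from the practice problem
n = len(sample)
x_bar = mean(sample)

# (a) Deviations from the mean; their sum should be zero up to rounding.
deviations = [x - x_bar for x in sample]
sum_dev = sum(deviations)

# (b) Sample variance and standard deviation, dividing by n - 1.
variance = sum(d ** 2 for d in deviations) / (n - 1)
s_x = sqrt(variance)

# (c) Library result, standing in for the calculator.
s_lib = stdev(sample)

print("mean =", x_bar)
print("sum of deviations =", sum_dev)
print("variance =", variance, "std dev =", s_x, "library =", s_lib)
```

The mean is 4.4, the squared deviations add to 41.2, the variance is 41.2 / 4 = 10.3, and the standard deviation is about 3.21. Because of floating-point rounding, the printed sum of deviations may be a tiny nonzero number rather than exactly 0.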

Practice:

3. Without computing, match each list of numbers on the left with its SD on the right:

   a. 1, 1, 1, 1                    i. 0
   b. 1, 2, 2                       ii. 0.058
   c. 1, 2, 3, 4, 5                 iii. 0.577
   d. 10, 20, 20                    iv. 1.581
   e. 0.1, 0.2, 0.2                 v. 3.162
   f. 0, 2, 4, 6, 8                 vi. 3.606
   g. 0, 0, 0, 0, 5, 6, 6, 8, 8     vii. 5.774

The Five-Number Summary

The Five-Number Summary includes: minimum, Q1, median, Q3, maximum

It is used to create Boxplots.

The five-number summary is usually better than the mean and std dev for describing a skewed distribution or a distribution with strong outliers.

Use x-bar and s_x only for reasonably symmetric distributions that are free of outliers.

Danger Will Robinson, danger!!!

While all of the methods discussed for computing numerical measures are very useful, they should not be applied blindly. Statistical measures and methods based on them are generally meaningful only for distributions of sufficiently regular shape.

What Happened to the Whiskers?

Side-by-Side Boxplots:

[Figure: side-by-side boxplots with the minimum, Q1, median, Q3, and maximum labeled]

Calculating Outliers

An observation is considered an Outlier if it falls outside the interval:

(Q1 - 1.5 × IQR, Q3 + 1.5 × IQR)

In general, it is not a good idea to just ignore or delete outliers, but they do have a strong influence on the data, so sometimes calculations are done with and without the outliers and then compared.
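The 1.5 × IQR rule and the with/without comparison can be sketched as follows. This assumes the median-of-halves rule for quartiles and IQR = Q3 - Q1. The function names and the data set are hypothetical, chosen only for illustration.

```python
from statistics import mean, median

def quartiles(data):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    xs = sorted(data)
    n = len(xs)
    lower = xs[: n // 2]
    upper = xs[n // 2 + n % 2 :]   # skip the middle value when n is odd
    return median(lower), median(upper)

def find_outliers(data, k=1.5):
    """Return the fences (low, high) and the values falling outside them."""
    q1, q3 = quartiles(data)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return (low, high), [x for x in data if x < low or x > high]

# Hypothetical data with one unusually large value.
data = [2, 4, 4, 5, 7, 8, 10, 30]
fences, outliers = find_outliers(data)

# Repeat the summary without the flagged values and compare.
trimmed = [x for x in data if x not in outliers]
print("fences =", fences, "outliers =", outliers)
print("mean:  ", mean(data), "->", mean(trimmed))
print("median:", median(data), "->", median(trimmed))
```

Here Q1 = 4 and Q3 = 9, so the interval is (-3.5, 16.5) and only 30 is flagged. Dropping it moves the mean from 8.75 to about 5.71, while the median only moves from 6 to 5.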

The Influence of Outliers

Resistant: a summary statistic is resistant to outliers if it is not changed very much when the outlier is removed from the data set:
median, IQR

Sensitive: a summary statistic is sensitive to outliers if it tends to be affected by outliers:
mean, range, standard deviation

Give Me a Graph, Baby!

Remember that a graph gives the best overall picture of a distribution. Numerical measures of center and spread report specific facts about a distribution, but they do not describe its entire shape. Always plot your data!

Changing the Unit of Measurement

A change in the measurement unit is called a Linear Transformation.

A linear transformation changes the original variable x into the new variable x_new by using the equation:

x_new = a + bx

So What Does That Mean?

Adding the constant a shifts all of the values of x left or right by the same amount (the data is recentered).

Multiplying by the positive constant b changes the size of the unit of measurement (the data is rescaled).

What Measures Are Affected?

Adding a (recentering) adds a to the mean, median, and quartiles. However, none of the measures of spread change.

Multiplying by b (rescaling) multiplies both the measures of center and the measures of spread by b.

Linear transformations do not change the shape of a distribution!
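These effects can be checked numerically. The sketch below applies x_new = a + bx with a = 32 and b = 1.8 (Celsius to Fahrenheit) to a small data set. The temperatures and the helper name `transform` are hypothetical, chosen only for illustration.

```python
from math import isclose
from statistics import mean, median, stdev

def transform(data, a, b):
    """Apply x_new = a + b*x to every value."""
    return [a + b * x for x in data]

# Hypothetical temperatures in degrees Celsius (not from the slides).
celsius = [10, 15, 20, 25, 40]
a, b = 32, 1.8  # Celsius to Fahrenheit
fahrenheit = transform(celsius, a, b)

# Compare each summary before and after the transformation.
for name, stat in [("mean", mean), ("median", median), ("std dev", stdev)]:
    print(name, stat(celsius), "->", stat(fahrenheit))
```

The mean and median each become a + b times the original value, while the standard deviation is only multiplied by b, since adding a does not change spread.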

Practice

4. The mean height of a class of 15 children is 48 inches, the median is 45 inches, the standard deviation is 2.4 inches, and the IQR is 3 inches. Find the mean, median, standard deviation, and IQR if...
   a. you convert each height to feet.
   b. each child grows 2 inches.
   c. each child grows 4 inches and you convert their heights to feet.