1 Slide
© 2003 South-Western/Thomson Learning™
Chapter 3
Descriptive Statistics: Numerical Methods
Measures of Variability
Measures of Relative Location and Detecting Outliers
Exploratory Data Analysis
Measures of Association Between Two Variables
2 Slide
Measures of Variability
It is often desirable to consider measures of variability (dispersion), as well as measures of location.
For example, in choosing supplier A or supplier B we might consider not only the average delivery time for each, but also the variability in delivery time for each.
Range
Interquartile Range
Variance
Standard Deviation
Coefficient of Variation
3 Slide
Measures of Variation
Variation
Range
Interquartile Range
Variance: Population Variance, Sample Variance
Standard Deviation: Population Standard Deviation, Sample Standard Deviation
Coefficient of Variation
4 Slide
Variation
Measures of variation give information on the spread or variability of the data values.
[Figure: same center, different variation]
5 Slide
Range
Simplest measure of variation
Difference between the largest and the smallest observations:
$$\text{Range} = x_{\text{maximum}} - x_{\text{minimum}}$$
Example (observations plotted on an axis from 0 to 14):
Range = 14 - 1 = 13
6 Slide
Example: Apartment Rents
Range
Range = largest value - smallest value
Range = 615 - 425 = 190
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
7 Slide
Interquartile Range
The interquartile range of a data set is the difference between the third quartile and the first quartile.
It is the range for the middle 50% of the data.
8 Slide
Example: Apartment Rents
Interquartile Range
3rd Quartile (Q3) = 525
1st Quartile (Q1) = 445
Interquartile Range = Q3 - Q1 = 525 - 445 = 80
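The range and interquartile range of the apartment rents can be recomputed with a short Python sketch. The quartiles here come from `statistics.quantiles` with its default "exclusive" method; that choice is an assumption, since the slides do not say which quartile rule they use, and other rules can give slightly different quartiles on other data sets.

```python
import statistics

# The 70 apartment rents from the example, in ascending order.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

# Range: largest value minus smallest value.
data_range = max(rents) - min(rents)

# Quartile cut points (assumed rule: the library's default "exclusive" method).
q1, median, q3 = statistics.quantiles(rents, n=4)

# Interquartile range: spread of the middle 50% of the data.
iqr = q3 - q1

print(f"Range = {data_range}")
print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
```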
9 Slide
Variance
The variance is a measure of variability that utilizes all the data.
It is based on the difference between the value of each observation ($x_i$) and the mean ($\bar{x}$ for a sample, $\mu$ for a population).
10 Slide
Variance
The variance is the average of the squared differences between each data value and the mean.
If the data set is a sample, the variance is denoted by $s^2$.
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$
If the data set is a population, the variance is denoted by $\sigma^2$.
$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$$
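As a minimal sketch of the two variance formulas, the functions below differ only in the divisor: n - 1 for a sample, N for a population. The function names and the eight data values are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def population_variance(values):
    """Sum of squared deviations from the population mean, divided by N."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


# Hypothetical data, not taken from the slides.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print("sample variance:    ", sample_variance(data))
print("population variance:", population_variance(data))
```

The sample version is always the larger of the two for the same data, because the same sum of squares is divided by a smaller number.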
11 Slide
Variance for Grouped Data
Sample Data
$$s^2 = \frac{\sum f_i (X_i - \bar{x})^2}{n - 1}$$
Population Data
$$\sigma^2 = \frac{\sum f_i (X_i - \mu)^2}{N}$$
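The grouped-data formulas can be sketched in Python as follows. This assumes the usual reading in which each class is represented by a single value (typically its midpoint) weighted by the class frequency; the slide itself does not define the symbols. The function name and the frequency table are hypothetical.

```python
def grouped_variance(midpoints, frequencies, sample=True):
    """Variance from a frequency table.

    Each class contributes frequency * (midpoint - mean)^2 to the sum of
    squares. The divisor is n - 1 for a sample and N for a population.
    """
    n = sum(frequencies)
    mean = sum(f * x for x, f in zip(midpoints, frequencies)) / n
    sum_sq = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, frequencies))
    return sum_sq / (n - 1) if sample else sum_sq / n


# Hypothetical frequency table: three classes with midpoints 5, 15, 25.
midpoints = [5, 15, 25]
frequencies = [2, 5, 3]

print("sample variance:    ", grouped_variance(midpoints, frequencies))
print("population variance:", grouped_variance(midpoints, frequencies, sample=False))
```

Because every observation in a class is replaced by one representative value, the result is an approximation of the variance of the raw data.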
12 Slide
Standard Deviation
Most commonly used measure of variation
Shows variation about the mean
The standard deviation of a data set is the positive square root of the variance.
If the data set is a sample, the standard deviation is denoted $s$.
$$s = \sqrt{s^2}$$
If the data set is a population, the standard deviation is denoted $\sigma$ (sigma).
$$\sigma = \sqrt{\sigma^2}$$
13 Slide
Calculation Example: Sample Standard Deviation
Sample Data ($X_i$): 10 12 14 15 17 18 18 24
n = 8, Mean = $\bar{x}$ = 16
$$s = \sqrt{\frac{(10 - \bar{x})^2 + (12 - \bar{x})^2 + (14 - \bar{x})^2 + \cdots + (24 - \bar{x})^2}{n - 1}}$$
$$= \sqrt{\frac{(10 - 16)^2 + (12 - 16)^2 + (14 - 16)^2 + \cdots + (24 - 16)^2}{8 - 1}} = \sqrt{\frac{130}{7}} = 4.3095$$
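The calculation for the eight sample values can be checked step by step with the sketch below, which lists every squared deviation so the sum under the square root can be verified by hand. The cross-check against `statistics.stdev` is an addition for illustration; the variable names are not from the slides.

```python
import math
import statistics

sample = [10, 12, 14, 15, 17, 18, 18, 24]

n = len(sample)
mean = sum(sample) / n

# Squared deviation of each observation from the sample mean.
squared_devs = [(x - mean) ** 2 for x in sample]
sum_sq = sum(squared_devs)

# Sample standard deviation: square root of sum of squares over n - 1.
s = math.sqrt(sum_sq / (n - 1))

# The standard library should agree with the manual calculation.
assert math.isclose(s, statistics.stdev(sample))

print("mean:", mean)
print("squared deviations:", squared_devs)
print("sum of squares:", sum_sq)
print("s:", round(s, 4))
```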
14 Slide
Coefficient of Variation
Measures relative variation
Always in percentage (%)
Shows variation relative to mean
Is used to compare two or more sets of data measured in different units
Population:
$$CV = \left(\frac{\sigma}{\mu}\right) \cdot 100\%$$
Sample:
$$CV = \left(\frac{s}{\bar{x}}\right) \cdot 100\%$$
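A small sketch of the coefficient of variation follows. The function name is hypothetical; the inputs are the standard deviation (54.74) and mean (490.80) quoted for the apartment rents elsewhere in these slides.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation expressed as a percentage of the mean."""
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return std_dev / mean * 100


# Summary statistics quoted for the apartment rents example.
cv = coefficient_of_variation(54.74, 490.80)
print(f"CV = {cv:.2f}%")
```

Because the result is unit-free, two data sets measured in different units can be compared directly by their CV values.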
15 Slide
Example: Apartment Rents
Variance
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = 2{,}996.16$$
Standard Deviation
$$s = \sqrt{s^2} = \sqrt{2996.16} = 54.74$$
Coefficient of Variation
$$\frac{s}{\bar{x}} \cdot 100 = \frac{54.74}{490.80} \cdot 100 = 11.15$$
16 Slide
Measures of Relative Location and Detecting Outliers
z-Scores
Detecting Outliers
17 Slide
z-Scores
The z-score is often called the standardized value.
It denotes the number of standard deviations a data value $x_i$ is from the mean.
$$z_i = \frac{x_i - \bar{x}}{s}$$
A data value less than the sample mean will have a z-score less than zero.
A data value greater than the sample mean will have a z-score greater than zero.
A data value equal to the sample mean will have a z-score of zero.
18 Slide
Example: Apartment Rents
z-Score of Smallest Value (425)
$$z = \frac{x_i - \bar{x}}{s} = \frac{425 - 490.80}{54.74} = -1.20$$
Standardized Values for Apartment Rents
-1.20 -1.11 -1.11 -1.02 -1.02 -1.02 -1.02 -1.02 -0.93 -0.93
-0.93 -0.93 -0.93 -0.84 -0.84 -0.84 -0.84 -0.84 -0.75 -0.75
-0.75 -0.75 -0.75 -0.75 -0.75 -0.56 -0.56 -0.56 -0.47 -0.47
-0.47 -0.38 -0.38 -0.34 -0.29 -0.29 -0.29 -0.20 -0.20 -0.20
-0.20 -0.11 -0.01 -0.01 -0.01 0.17 0.17 0.17 0.17 0.35
0.35 0.44 0.62 0.62 0.62 0.81 1.06 1.08 1.45 1.45
1.54 1.54 1.63 1.81 1.99 1.99 1.99 1.99 2.27 2.27
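The standardized values can be regenerated from the raw rents with the sketch below. It computes the sample mean and sample standard deviation from the data rather than using the rounded figures, so individual z-scores may differ from the table in the last printed digit.

```python
import statistics

# The 70 apartment rents from the example, in ascending order.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

mean = statistics.mean(rents)
s = statistics.stdev(rents)  # sample standard deviation (divisor n - 1)

# z-score: number of standard deviations each value lies from the mean.
z_scores = [(x - mean) / s for x in rents]

# Print ten values per row, mirroring the layout of the table.
for start in range(0, len(z_scores), 10):
    print(" ".join(f"{z:5.2f}" for z in z_scores[start:start + 10]))
```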
19 Slide
Detecting Outliers
An outlier is an unusually small or unusually large value in a data set.
A data value with a z-score less than -3 or greater than +3 might be considered an outlier.
It might be an incorrectly recorded data value.
It might be a data value that was incorrectly included in the data set.
20 Slide
Example: Apartment Rents
Detecting Outliers
The most extreme z-scores are -1.20 and 2.27.
Using |z| > 3 as the criterion for an outlier, there are no outliers in this data set.
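To show the |z| > 3 criterion actually flagging a value, here is a sketch on a made-up data set. The function name `find_outliers`, the `threshold` parameter, and the readings are all hypothetical; only the rule itself comes from the slides.

```python
import statistics


def find_outliers(values, threshold=3.0):
    """Return the values whose z-score magnitude exceeds the threshold."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    return [x for x in values if abs((x - mean) / s) > threshold]


# Hypothetical readings: fourteen values near 50 and one suspicious 120.
readings = [48, 50, 52, 49, 51, 50, 47, 53, 50, 49, 51, 52, 48, 50, 120]

print("possible outliers:", find_outliers(readings))
```

A flagged value is only a candidate for review: the extreme value itself inflates the mean and standard deviation, so in small data sets a genuine outlier can fail to reach the threshold.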
21 Slide
Exploratory Data Analysis
Five-Number Summary
22 Slide
Five-Number Summary
Smallest Value
First Quartile
Median
Third Quartile
Largest Value
23 Slide
Example: Apartment Rents
Five-Number Summary
Lowest Value = 425
First Quartile = 445
Median = 475
Third Quartile = 525
Largest Value = 615
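A five-number summary can be computed with the sketch below. The percentile rule is an assumption, since the slides do not state one: compute the position i = (p/100)n; if i is not an integer, round up and take the value in that position; if i is an integer, average the values in positions i and i + 1. The function names are hypothetical.

```python
def percentile(sorted_values, p):
    """p-th percentile (integer p, 0 < p < 100) by the position rule i = (p/100) * n."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    n = len(sorted_values)
    whole, remainder = divmod(p * n, 100)
    if remainder:
        # i is not an integer: round up to position whole + 1 (index whole).
        return sorted_values[whole]
    # i is an integer: average positions i and i + 1 (indices whole - 1, whole).
    return (sorted_values[whole - 1] + sorted_values[whole]) / 2


def five_number_summary(values):
    """Smallest value, first quartile, median, third quartile, largest value."""
    ordered = sorted(values)
    return (
        ordered[0],
        percentile(ordered, 25),
        percentile(ordered, 50),
        percentile(ordered, 75),
        ordered[-1],
    )


# The 70 apartment rents from the example.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

summary = five_number_summary(rents)
print("min, Q1, median, Q3, max =", summary)
```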
24 Slide
Measures of Association between Two Variables
Covariance
Correlation Coefficient
25 Slide
Covariance
The covariance is a measure of the linear association between two variables.
Positive values indicate a positive relationship.
Negative values indicate a negative relationship.
26 Slide
Covariance
If the data sets are samples, the covariance is denoted by $s_{xy}$.
$$s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$
If the data sets are populations, the covariance is denoted by $\sigma_{xy}$.
$$\sigma_{xy} = \frac{\sum (x_i - \mu_x)(y_i - \mu_y)}{N}$$
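The two covariance formulas can be sketched in one function, with the divisor switching between n - 1 and N. The function name and the paired data are hypothetical and serve only to illustrate the arithmetic.

```python
def covariance(xs, ys, sample=True):
    """Average product of paired deviations from the two means.

    Divides by n - 1 for sample data and by N for population data.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross / (n - 1) if sample else cross / n


# Hypothetical paired observations, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

print("sample covariance:    ", covariance(x, y))
print("population covariance:", covariance(x, y, sample=False))
```

The sign is positive here because larger x values tend to pair with larger y values; the magnitude depends on the units of both variables.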
27 Slide
Correlation Coefficient
The coefficient can take on values between -1 and +1.
Values near -1 indicate a strong negative linear relationship.
Values near +1 indicate a strong positive linear relationship.
If the data sets are samples, the coefficient is $r_{xy}$.
$$r_{xy} = \frac{s_{xy}}{s_x s_y}$$
If the data sets are populations, the coefficient is $\rho_{xy}$.
$$\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$$
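A minimal sketch of the sample correlation coefficient divides the sample covariance by the product of the two sample standard deviations. The function name and the paired data are hypothetical.

```python
import math


def sample_correlation(xs, ys):
    """r_xy = s_xy / (s_x * s_y), using n - 1 divisors throughout."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return s_xy / (s_x * s_y)


# Hypothetical paired observations, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = sample_correlation(x, y)
print(f"r_xy = {r:.4f}")
```

Unlike the covariance, the result does not change when either variable is rescaled to different units, which is why it always falls between -1 and +1.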