maximillian-lamb
Basic Statistics
Six Sigma Foundations
Continuous Improvement Training
Six Sigma Simplicity
Key Learning Points
• Simple statistics can:
  - Increase your understanding of process behaviors
  - Help identify improvement opportunities for 6σ (Six Sigma)
Statistics
• Common statistics:
  - Miles per gallon (liter): mpg (mpl)
  - Median home prices
  - Consumer price index
  - Inflation rate
  - Stock market average
  - Airline on-time arrival rate
• Statistics are computed using data.
• Statistics summarize the data and help us to predict future performance.
Basic Statistics
• Serve as a means to analyze data collected in the Measure phase.
• Allow us to numerically describe the data that characterizes our process’ Xs and Ys.
• Use past process and performance data to make inferences about the future.
• Serve as a foundation for advanced statistical problem-solving methodologies.
• Are a concept that creates a universal language based on numerical facts rather than intuition.
Data Visualization
• Before any statistical tools are applied, visually display and look at your data.
• A histogram allows us to look at how the data is distributed across our Y scale of measure.
[Figure: histogram of the number of wins for National Football League teams (1998). X-axis: number of games won (0 to 15); Y-axis: number of teams (0 to 5). Annotation: five teams won eight games. Source: AOLSports]
Building a Histogram
The following data came from our bicycle test facility: stopping distances required to bring a 150 lb weight to a complete stop with the rear brake applied from a 10 mph cruising speed.
Trial (sample #):      1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
Stop distance (feet): 14  6 13  7 10 10 11  9 11  9 11  9 10 10 10
[Figure: blank histogram grid for the stopping distance data. X-axis: feet (6 to 14); Y-axis: frequency]
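As a rough illustration of how the histogram could be built, the sketch below counts how often each stopping distance occurs and prints a simple text histogram. The variable names are illustrative, and one-foot bins are an assumption based on the axis ticks; the slides do not prescribe a tool.

```python
from collections import Counter

# Stopping distances (feet) for trials 1 to 15, copied from the data table.
stop_distances = [14, 6, 13, 7, 10, 10, 11, 9, 11, 9, 11, 9, 10, 10, 10]

# Count how many times each distance occurred.
frequency = Counter(stop_distances)

# Print one row per one-foot bin from the smallest to the largest value.
# Empty bins are printed too, so gaps in the data stay visible.
for feet in range(min(stop_distances), max(stop_distances) + 1):
    print(f"{feet:>2} ft | {'X' * frequency[feet]}")
```

Each `X` stands for one trial, so the longest row marks the most common stopping distance.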
In addition to counting occurrences and graphing the results, we can describe processes in terms of central tendency and dispersion.
Measures of Central Tendency
• Mean (μ, Xbar): the arithmetic average of a set of values
  - Uses the quantitative value of each data point
  - Is strongly influenced by extreme values
• Median (M): the number that reflects the middle of a set of values
  - Is the 50th percentile
  - Is identified as the middle number after all the values are sorted from high to low
  - Is not affected by extreme values
• Mode: the most frequently occurring value in a data set
Central Tendency Exercise
• Determine the mean, median, and mode for the bicycle stopping distances used to create the histograms.
  Mean = ________
  Median = ________
  Mode = ________
Trial:                 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
Stop distance (feet): 14  6 13  7 10 10 11  9 11  9 11  9 10 10 10
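One way to check your answers is with Python's standard `statistics` module. This is only a sketch; the variable names are illustrative and not part of the course material.

```python
import statistics

# Stopping distances (feet) for trials 1 to 15, copied from the table.
stop_distances = [14, 6, 13, 7, 10, 10, 11, 9, 11, 9, 11, 9, 10, 10, 10]

mean_distance = statistics.mean(stop_distances)      # arithmetic average
median_distance = statistics.median(stop_distances)  # middle value after sorting
mode_distance = statistics.mode(stop_distances)      # most frequent value

print("Mean   =", mean_distance)
print("Median =", median_distance)
print("Mode   =", mode_distance)
```

For this data set all three measures come out to 10 feet.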
Mean, Median, Mode
[Figure: three frequency histograms with the mean, median, and mode marked. Panels: Positive Skew, Negative Skew, Normal. Y-axis: frequency]
Measures of Dispersion
Range (R): the difference between the highest and lowest values

\[ R = x_{\max} - x_{\min} \]

Sample Variance (s²): the average squared distance of each point from the average (Xbar, written \bar{x} in the formulas)

\[ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \dots + (x_n - \bar{x})^2}{n - 1} \]

Sample Standard Deviation (s): the square root of the variance

\[ s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} \]
Example of Measures of Dispersion
[Figure: histogram of the number of wins for National Football League teams (1998). X-axis: number of wins (0 to 15); Y-axis: frequency (0 to 5). Source: AOLSports]
Range = 12
Xbar = 8
s² = 11.72
s = 3.42
Dispersion Exercise
Find measures of dispersion for the stopping distance data. Fill in the table at the right.
Range (R) =
Variance (s²) =
Std Dev (s) =
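The sketch below works the exercise directly from the definitions of range, sample variance, and sample standard deviation. It is one possible way to check a hand calculation; the variable names are illustrative.

```python
import math

# Stopping distances (feet) for trials 1 to 15.
stop_distances = [14, 6, 13, 7, 10, 10, 11, 9, 11, 9, 11, 9, 10, 10, 10]

n = len(stop_distances)
x_bar = sum(stop_distances) / n  # sample mean (Xbar)

# Range: highest value minus lowest value.
data_range = max(stop_distances) - min(stop_distances)

# Sample variance: sum of squared distances from Xbar, divided by n - 1.
sum_of_squares = sum((x - x_bar) ** 2 for x in stop_distances)
sample_variance = sum_of_squares / (n - 1)

# Sample standard deviation: square root of the sample variance.
sample_std_dev = math.sqrt(sample_variance)

print("Range (R)     =", data_range)
print("Variance (s2) =", sample_variance)
print("Std Dev (s)   =", sample_std_dev)
```

For this data the squared distances sum to 56, so the range is 8 feet, the variance is 56 / 14 = 4, and the standard deviation is 2 feet.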
Population vs. Sample (Certainty vs. Uncertainty)
A sample is just a subset of all possible values.
[Figure: diagram labeled Population and Sample]
Since the sample does not contain all the possible values, there is some uncertainty about the population. Hence any statistics, such as mean and standard deviation, are just estimates of the true population parameters.
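To make the uncertainty concrete, the sketch below draws several small samples from a made-up population and compares each sample mean with the population mean. The population (the whole numbers 1 to 100), the sample size of 10, and the seed are all hypothetical choices for illustration, not data from the course.

```python
import random
import statistics

# Hypothetical population, chosen only for illustration: whole numbers 1 to 100.
population = list(range(1, 101))
population_mean = statistics.mean(population)

# Fixed seed so that rerunning the script draws the same samples.
rng = random.Random(0)

sample_means = []
for _ in range(5):
    sample = rng.sample(population, 10)  # 10 values drawn without replacement
    sample_means.append(statistics.mean(sample))

print("Population mean:", population_mean)
print("Sample means:   ", sample_means)
```

Each sample mean is only an estimate of the population mean, which is why the values printed on the second line generally differ from one another.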
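Symbols
Mean (n = number of values in the sample, N = number of values in the population)

Sample:
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \]

Population:
\[ \mu = \frac{\sum_{i=1}^{N} x_i}{N} \]

Standard deviation (little "s" for the sample)

Sample:
\[ s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} \]

Population:
\[ \sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}} \]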
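The sample and population standard deviations differ only in the divisor: n - 1 for a sample, N for a population. As a sketch, Python's `statistics` module provides both (`stdev` and `pstdev`). Applying both to the bicycle stopping distances is purely for comparison; those 15 trials are a sample.

```python
import math
import statistics

# Stopping distances (feet) for trials 1 to 15.
stop_distances = [14, 6, 13, 7, 10, 10, 11, 9, 11, 9, 11, 9, 10, 10, 10]

# Sample formula: divides the sum of squared distances by n - 1.
sample_sd = statistics.stdev(stop_distances)

# Population formula: divides the same sum by N.
population_sd = statistics.pstdev(stop_distances)

print("Sample standard deviation (s):", sample_sd)
print("Population standard deviation:", population_sd)
```

Because N is larger than n - 1, the population formula always gives the smaller number for the same data. The gap shrinks as the number of values grows.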
The Normal Curve
• In 80 to 90% of problems worked, data will follow a normal bell curve or can be transformed to look like a normal curve.
• This curve is described by the Xbar and s statistics.
• The area under this curve is 1, or 100%.
[Figure: normal bell curve, labeled X and s]
For the normal curve, mean = median = mode.
Normal Bell Curve Properties
• Histograms (bar charts) are developed from samples.
• Sample statistics (Xbar and s) are calculated from representatives of the population.
• From the histogram and sample statistics, we form a curve that represents the population from which these samples were drawn.
• 99.9999998% of the data falls within 6 standard deviations of the mean (Xbar ± 6sd)
• 99.73% of the data falls within 3 standard deviations of the mean (Xbar ± 3sd)
• 68.27% of the data falls within 1 standard deviation of the mean (Xbar ± 1sd)
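These coverage figures can be reproduced from the normal distribution itself. The sketch below uses the standard identity that the fraction of a normal distribution within k standard deviations of the mean equals erf(k / √2). The function name `fraction_within` is illustrative.

```python
import math


def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 3, 6):
    print(f"Within {k} standard deviation(s): {fraction_within(k):.7%}")
```

To two decimal places the one-standard-deviation figure is 68.27% and the three-standard-deviation figure is 99.73%.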
Other Data Distributions
[Figure: frequency histograms of other data distributions. Panels: Normal, Uniform, Exponential, Log Normal]
Normal Curve Exercise
• Here is a histogram of the bike stopping distance data (Xbar = 10, s = 2).
• Does the histogram appear normal?
• Draw vertical lines at 1sd, 2sd, 4sd.
• Discuss.
[Figure: histogram of the bike stopping distance data. X-axis: 1 to 20; Y-axis: frequency (0 to 5)]
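As a starting point for the discussion, the sketch below computes where the vertical lines would fall and what share of the 15 trials lies between them. It assumes that the lines are drawn on both sides of Xbar (Xbar ± k·s) and that values exactly on a line count as inside. Both are interpretations, not something the exercise states.

```python
# Stopping distances (feet) for trials 1 to 15.
stop_distances = [14, 6, 13, 7, 10, 10, 11, 9, 11, 9, 11, 9, 10, 10, 10]

x_bar = 10  # Xbar as given in the exercise
s = 2       # s as given in the exercise


def share_within(k):
    """Share of trials between Xbar - k*s and Xbar + k*s, boundaries included."""
    lower, upper = x_bar - k * s, x_bar + k * s
    inside = [x for x in stop_distances if lower <= x <= upper]
    return len(inside) / len(stop_distances)


for k in (1, 2, 4):
    lower, upper = x_bar - k * s, x_bar + k * s
    print(f"{k}sd: lines at {lower} and {upper} feet, "
          f"{share_within(k):.1%} of trials inside")
```

Under these assumptions 11 of the 15 trials fall within 1sd (8 to 12 feet), and all 15 fall within 2sd (6 to 14 feet). With only 15 values, close agreement with the normal-curve percentages should not be expected.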