ARI 04 Basic Data Analysis


  • 8/12/2019 ARI 04 Basic Data Analysis


    Delphi. All rights reserved. Core tools I 032707HO

    Overview and Theory of Distributions
    Histogram
    Normal Probability Plots
    Identifying a Distribution
    Data Examples

  • 8/12/2019 ARI 04 Basic Data Analysis


    Properties Of A Normal Distribution

    The normal distribution is the concept that is the basis for most statistical tests.

    Completely described by its mean and standard deviation
    Tails extend to plus and minus infinity
    Area under curve represents 100% of possible observations
    Curve is symmetrical; 50% each side of mean

    [Figure: bell-shaped histogram; x-axis Days (60 to 120), y-axis Frequency (0 to 300)]

  • 8/12/2019 ARI 04 Basic Data Analysis


    Normal Distribution

    The mean and standard deviation are required to fully describe the distribution.

    Compare the means of each distribution.

    The means are the same but the standard deviations differ.

    [Figure: three normal curves labeled 1st, 2nd and 3rd Distribution, all centered on the same Mean]

  • 8/12/2019 ARI 04 Basic Data Analysis


    The Standard Normal Curve

    The standard normal curve is a special case of the normal distribution where the mean = 0 and the standard deviation = 1.

    [Figure: standard normal curve; x-axis from -4 to 4 standard deviations]

    Range             Theoretical   Empirical
    within 1 StDev    68.26%        60-75%
    within 2 StDev    95.44%        90-98%
    within 3 StDev    99.73%        99-100%

    99.7% of the population is within approximately 3 standard deviations of the mean.
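The theoretical percentages can be checked from the cumulative normal distribution. The following is a minimal sketch using only the Python standard library; the function name `fraction_within` is illustrative and not part of the course material.

```python
import math

def fraction_within(k: float) -> float:
    """Fraction of a normal population within k standard deviations of the mean."""
    # For any normal distribution, P(|X - mean| <= k * StDev) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} StDev: {100 * fraction_within(k):.2f}%")
```

The computed values agree with the slide to within rounding in the last digit (the slide quotes 68.26% and 95.44%).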

  • 8/12/2019 ARI 04 Basic Data Analysis


    Histogram

    Purpose: To show the shape of the data

    Applications:
    Show variation or range of data
    Performance of data around a nominal target
    To understand the amount of data at a given point
    To find outliers in the process

    [Figure: example histogram; x-axis classes from 20.5 to 41.5, y-axis ticks at 5, 10 and 15]

  • 8/12/2019 ARI 04 Basic Data Analysis


    Anatomy Of A Histogram

    A. Vertical axis: Either Frequency or the Percentage of data points in each Class
    B. Modal Class: Class with the highest frequency
    C. Frequency: Number of data points found in each Class
    D. Class: Each bar is one Class, or interval, or bin
    E. Horizontal axis: Scale of measure for the element being plotted

    [Figure: histogram annotated with labels A to E; x-axis from 100 to 900, y-axis Frequency (0 to 60)]
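The parts of a histogram named above (classes, frequencies, modal class) can be worked out by hand in a few lines. This is a minimal sketch, assuming equal-width classes; the data values and the `class_frequencies` helper are made up for illustration and do not come from the course data files.

```python
from collections import Counter

def class_frequencies(data, start, width):
    """Count how many data points fall in each class [lower, lower + width)."""
    counts = Counter()
    for x in data:
        index = int((x - start) // width)   # which class the point falls in
        counts[start + index * width] += 1  # key is the lower class boundary
    return dict(sorted(counts.items()))

# Hypothetical measurements, for illustration only.
data = [120, 250, 260, 310, 330, 340, 390, 410, 420, 450, 520, 610, 780]
freq = class_frequencies(data, start=100, width=100)
modal_class = max(freq, key=freq.get)       # class with the highest frequency

for lower, count in freq.items():
    print(f"{lower}-{lower + 100}: {'#' * count} ({count})")
print("Modal class starts at", modal_class)
```

Changing `width` shows how the apparent shape of the data depends on the choice of class interval.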

  • 8/12/2019 ARI 04 Basic Data Analysis


    Follow-me Histogram Example

    Open MINITAB data file: 12a_Basic_Stats.mpj
    Run: Stat > Basic Statistics > Display Descriptive Statistics
    Highlight all columns from Normal to PS 2 and click Select

    Note that the mean and StDev of Normal, Pos Skew and Neg Skew are identical.

    Variable    N     Mean     Median   StDev    SE Mean
    Normal      500   70.000   69.977   10.000   0.447
    Pos Skew    500   70.000   65.695   10.000   0.447
    Neg Skew    500   70.000   73.783   10.000   0.447
    Mystery     500   100.00   104.20   32.38    1.45
    PS 2        500   70.010   66.000   9.981    0.446
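The columns of the descriptive statistics table can be reproduced outside MINITAB. The sketch below uses only the Python standard library. The `describe` helper and the sample values are hypothetical, since the project file itself is not reproduced here. SE Mean is the sample standard deviation divided by the square root of N.

```python
import math
import statistics

def describe(values):
    """Return N, Mean, Median, StDev and SE Mean for one column of data."""
    n = len(values)
    stdev = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    return {
        "N": n,
        "Mean": statistics.mean(values),
        "Median": statistics.median(values),
        "StDev": stdev,
        "SE Mean": stdev / math.sqrt(n),  # standard error of the mean
    }

# Hypothetical sample, for illustration only.
sample = [62.0, 65.0, 68.0, 70.0, 71.0, 74.0, 80.0]
for name, value in describe(sample).items():
    print(f"{name:8s}{value:10.3f}")

# Cross-check against the table: StDev 10.000 with N 500 should give SE Mean 0.447.
print(round(10.000 / math.sqrt(500), 3))
```

The same relation accounts for the Mystery row: 32.38 divided by the square root of 500 is about 1.45.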

  • 8/12/2019 ARI 04 Basic Data Analysis


    Histogram in Minitab

    Remembering that the means and standard deviations of the first three data sets were the same, let's graph them using histograms.

    Run: Graph > Histogram (Select Simple, Click OK)
    Enter Normal, Pos Skew and Neg Skew into Graph Variables field
    Click OK

  • 8/12/2019 ARI 04 Basic Data Analysis


    What The Graphs Show: 3 Different Distributions

    [Figure: Histogram of Neg Skew; x-axis Neg Skew (ticks 0 to 72), y-axis Frequency (0 to 250)]
    [Figure: Histogram of Pos Skew; x-axis Pos Skew (ticks 70 to 130), y-axis Frequency (0 to 140)]
    [Figure: Histogram of Normal; x-axis Normal (ticks 30 to 100), y-axis Frequency (0 to 70)]
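Mean and standard deviation alone cannot tell these three shapes apart, but a skewness coefficient can. The sketch below computes the simple moment coefficient of skewness on made-up data. The `skewness` helper is illustrative, and statistical packages may apply a small-sample adjustment to this formula, so their reported values can differ slightly while the sign stays the same.

```python
import statistics

def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (population moments)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

# Hypothetical data sets, for illustration only.
right_tailed = [1, 2, 2, 3, 3, 3, 4, 5, 9, 14]
left_tailed = [-x for x in right_tailed]   # mirror image, so the tail flips
symmetric = [1, 2, 3, 4, 5, 6, 7]

for name, data in [("right", right_tailed), ("left", left_tailed), ("symmetric", symmetric)]:
    print(name, round(skewness(data), 3),
          "mean", statistics.fmean(data), "median", statistics.median(data))
```

A positive coefficient goes with a mean above the median, as in the Pos Skew column (mean 70.000, median 65.695), and a negative coefficient with a mean below the median, as in Neg Skew.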

  • 8/12/2019 ARI 04 Basic Data Analysis


    Histogram with Groups

    Using the data file: 12a_Basic_Stats.mpj
    Graph > Histogram > With Fit and Groups
    Select Normal and Mystery as graph variables

  • 8/12/2019 ARI 04 Basic Data Analysis


    Histogram with Groups

    Mean & Std Dev for 2 data sets

  • 8/12/2019 ARI 04 Basic Data Analysis


    Data Sets Come in Various Shapes

    Left Skewed (Negatively Skewed)

    Uniform Distribution

    Bimodal Distribution

    Bell Shape: The Normal Distribution

    Right Skewed (Positively Skewed)

    Ex: Tool Wear during machining process

    Ex: Random Variation on a stable process

    Ex: Torque, capacity of a container

    Ex: Pre Sorting, Measurement System not sensitive enough

    Ex: Pre Sorting, Measurement System not sensitive enough

  • 8/12/2019 ARI 04 Basic Data Analysis


    Normal Probability Plots

    Normal probability plots are a graphical technique to determine if a data set is normally distributed.

    Using the previous data file: 12a_Basic_Stats.mpj
    Stat > Basic Statistics > Normality Test
    Select Normal as Variable

  • 8/12/2019 ARI 04 Basic Data Analysis


    Interpreting The Normal Probability Plot

    Normally distributed data will appear on the plot as a straight line.

    Generate plots for Pos Skew, Neg Skew, and Mystery. Are they normal?

    [Figure: Probability Plot of Normal; x-axis Normal (30 to 110), y-axis Percent (0.1 to 99.9). Legend: Mean 70.00, StDev 10.00, N 500, AD 0.418, P-Value 0.328]
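The idea behind the plot can be sketched without MINITAB: sort the data, pair each point with the standard normal quantile for its rank, and see how close the pairs fall to a straight line. The code below is an illustration under stated assumptions. The plotting position (i - 0.5) / n is one common choice and may not be the one MINITAB uses, the helper names are hypothetical, and the data are simulated rather than taken from the course file.

```python
import math
import random
import statistics

def probability_plot_points(values):
    """Pair each sorted observation with its theoretical standard normal quantile."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = statistics.NormalDist()  # mean 0, standard deviation 1
    return [(std_normal.inv_cdf((i - 0.5) / n), x)
            for i, x in enumerate(ordered, start=1)]

def straightness(points):
    """Pearson correlation of the plotted points; 1.0 is a perfectly straight line."""
    zs = [z for z, _ in points]
    xs = [x for _, x in points]
    z_mean, x_mean = statistics.fmean(zs), statistics.fmean(xs)
    cov = sum((z - z_mean) * (x - x_mean) for z, x in points)
    z_ss = sum((z - z_mean) ** 2 for z in zs)
    x_ss = sum((x - x_mean) ** 2 for x in xs)
    return cov / math.sqrt(z_ss * x_ss)

# Simulated data, for illustration only.
random.seed(1)
normal_like = [random.gauss(70, 10) for _ in range(200)]
right_skewed = [60 + random.expovariate(1 / 10) for _ in range(200)]

r_normal = straightness(probability_plot_points(normal_like))
r_skewed = straightness(probability_plot_points(right_skewed))
print(f"normal-like: {r_normal:.4f}  right-skewed: {r_skewed:.4f}")
```

The normal-like sample gives a correlation closer to 1 than the skewed sample, which is the numerical counterpart of the points hugging a straight line on the plot.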

  • 8/12/2019 ARI 04 Basic Data Analysis


    Examples: Normal Probability Plots

    [Figure: histogram of C1 (x-axis 20 to 110, Frequency 0 to 100) with its normal probability plot, titled Normal Distribution. Normal, bell shaped. Anderson-Darling Normality Test: Average 70, Std Dev 10, N of data 500, A-Squared 0.418, p-value 0.328]

    [Figure: histogram of C2 (x-axis 60 to 130, Frequency 0 to 300) with its normal probability plot, titled Positive Skewed Distribution. Not normal, positive skewed. Anderson-Darling Normality Test: Average 70, Std Dev 10, N of data 500, A-Squared 46.447, p-value 0.000]

  • 8/12/2019 ARI 04 Basic Data Analysis


    Anderson-Darling Test

    If the p-value from the Anderson-Darling test is less than an alpha of 0.05, the data is not normal per that test. However:

    The Anderson-Darling test is not robust to small sample sizes, so for samples of fewer than 50 points it is best to rely on the Fat Pencil test. If a fat pencil can cover all of the points on the normal probability plot, the data may safely be treated as normal.

    For large samples the Anderson-Darling test can detect slight departures from normality that will have little or no effect on the level of analysis that we will be performing. Again, use the Fat Pencil test to determine reasonable normality.
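For readers who want to see what sits behind the A-Squared and p-value figures, here is a minimal sketch of the Anderson-Darling statistic for normality with the mean and standard deviation estimated from the data. The p-value uses the widely published D'Agostino and Stephens approximation. That MINITAB computes its p-value in exactly this way is an assumption, although feeding in the A-Squared of 0.418 with N = 500 reported for the Normal column gives roughly the 0.328 reported there. The function names and the test data are illustrative.

```python
import math
import statistics

def ad_p_value(a_squared, n):
    """Approximate p-value from A-squared (D'Agostino and Stephens formulas)."""
    a = a_squared * (1 + 0.75 / n + 2.25 / n ** 2)  # small-sample adjustment
    if a >= 0.6:
        return math.exp(1.2937 - 5.709 * a + 0.0186 * a ** 2)
    if a >= 0.34:
        return math.exp(0.9177 - 4.279 * a - 1.38 * a ** 2)
    if a > 0.2:
        return 1 - math.exp(-8.318 + 42.796 * a - 59.938 * a ** 2)
    return 1 - math.exp(-13.436 + 101.14 * a - 223.73 * a ** 2)

def anderson_darling(values):
    """Return (A-squared, approximate p-value) for a test of normality."""
    ordered = sorted(values)
    n = len(ordered)
    fitted = statistics.NormalDist(statistics.fmean(ordered), statistics.stdev(ordered))
    cdf = [fitted.cdf(x) for x in ordered]
    total = 0.0
    for i in range(n):
        # Weight (2i - 1) for 1-based i; pairs the i-th smallest with the i-th largest.
        # An extreme outlier can push cdf to exactly 0 or 1 and make log() fail.
        total += (2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
    a_squared = -n - total / n
    return a_squared, ad_p_value(a_squared, n)

# Illustrative data built from exact quantiles (not the course data).
std_normal = statistics.NormalDist()
bell = [70 + 10 * std_normal.inv_cdf((i - 0.5) / 100) for i in range(1, 101)]
skewed = [-10 * math.log(1 - (i - 0.5) / 100) for i in range(1, 101)]

for name, data in [("bell", bell), ("skewed", skewed)]:
    a2, p = anderson_darling(data)
    print(f"{name}: A-Squared = {a2:.3f}, p-value = {p:.3f}")

print("slide check:", round(ad_p_value(0.418, 500), 3))
```

With alpha = 0.05, the bell-shaped data is not rejected and the skewed data is, matching the rule stated above.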

  • 8/12/2019 ARI 04 Basic Data Analysis


    Identifying an appropriate distribution

    This function allows you to fit your data with 14 distributions.

    Use individual distribution identification to find a useful distribution to fit the data if a normal distribution does not fit the data well. Why? In capability analysis, finding an appropriate distribution to fit the data is extremely important.

    Open the data file: Roughness.mpj
    Choose Stat > Quality Tools > Individual Distribution Identification
    Complete the dialog box as shown

  • 8/12/2019 ARI 04 Basic Data Analysis


    Interpreting your results
