
  • Basic Statistical Methods

    SHC790
    J vd Westhuizen

  • Context

    Data Description
    Probability
    Inference

  • Data Description

    How can we summarise small amounts of data?
    Remember that the following is required before we continue:
      The data needs to be a random sample.
      How accurately were the measurements made?
      Notice the difference between measurements and counts:
        Measurements: more accurate
        Counts: always whole numbers

  • Data Description

    Example table: bridge column support height

    Number   Height (m)
    1        8.78
    2        8.07
    3        8.15
    4        8.98
    5        9.02
    6        7.99
    7        8.52
    8        8.27
    9        8.89

    The most important things about the numbers in this list:
      How low a typical bridge is.
      How variable the numbers are.

  • Data Description

    Mean Height

    If the observations are referred to as x's, the symbol for their mean is $\bar{x}$:

    $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$

    Bridge mean height $= \frac{8.78 + 8.07 + 8.15 + 8.98 + 9.02 + 7.99 + 8.52 + 8.27 + 8.89}{9} = \frac{76.67}{9} = 8.519$ m

  • Data Description

    Median Height

    Number   Height (m)
    6        7.99
    2        8.07
    3        8.15
    8        8.27
    7        8.52
    1        8.78
    9        8.89
    4        8.98
    5        9.02

    Bridge median height = 8.52 m (the middle value of the sorted heights)
    If there is an even number of observations, we take the average of the two middle ones.

  • Data Description

    The Mode
    The mode is the observation that occurs most frequently. In our previous example there is no mode.
    The mode is more useful for counted data, for example the number of vehicles per hour over a 24-hour period.
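
The three measures of location above can be checked with a few lines of Python. This is a minimal sketch using the standard `statistics` module on the nine bridge heights; the variable names are illustrative and not part of the slides.

```python
from statistics import mean, median, multimode

# Bridge column support heights (m) from the example table.
heights = [8.78, 8.07, 8.15, 8.98, 9.02, 7.99, 8.52, 8.27, 8.89]

mean_height = mean(heights)        # sum of the observations divided by n
median_height = median(heights)    # middle value of the sorted observations
modes = multimode(heights)         # every value tied for "most frequent"

# All nine heights are distinct, so every value is tied at one occurrence:
# that is how this sketch detects "no mode".
has_mode = len(modes) < len(heights)

print(round(mean_height, 3), median_height, has_mode)
```

For measured data such as these heights the mode carries no information, which is why the slides reserve it for counts.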

  • Data Description

    Guidelines for mean, median and mode
    If there are no outliers, use the mean.
    If there are outliers:
      Measurements: use the median.
      Counts: use the mode (e.g. vehicles per hour, where peak hours have a substantial influence).

  • Data Description

    Other measures of location (see 10)
      Root mean square
      Geometric mean of n positive numbers = nth root of their product
      Harmonic mean

  • Data Description

    Measure of variation: Range
    The range of a set of numbers is the largest minus the smallest.
    The range of the bridge heights is 9.02 - 7.99 = 1.03 m.
    It is very sensitive to unusually big or small observations.

  • Data Description

    IQR and SIQR (Inter-quartile Range and Semi-Inter-quartile Range)
    The IQR is the distance between the quartiles.
    Quartiles separate the ordered data into 4 equal groups of 25% each:
      1st quartile (lower)
      2nd quartile (median)
      3rd quartile (upper)

  • Data Description

    To determine the positions of the lower and upper quartiles in the ordered data we recommend the following (n = number of observations):
      Lower quartile: 0.25n + 0.5
      Upper quartile: 0.75n + 0.5

  • Data Description

    Example
    n = 9
    LQ position = 0.25(9) + 0.5 = 2.75
    UQ position = 0.75(9) + 0.5 = 7.25

    Sorted heights (m): 7.99, 8.07, 8.15, 8.27, 8.52, 8.78, 8.89, 8.98, 9.02

    LQ = 8.13 (three quarters of the way from the 2nd value, 8.07, to the 3rd, 8.15)
    UQ = 8.91 (one quarter of the way from the 7th value, 8.89, to the 8th, 8.98)
    IQR = 8.91 - 8.13 = 0.78
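
The quartile rule can be sketched in Python as follows. The slide results (8.13 and 8.91) are consistent with linear interpolation between neighbouring sorted values, so that is assumed here; the helper name `value_at_position` is hypothetical.

```python
def value_at_position(sorted_values, position):
    """Value at a 1-based, possibly fractional position (linear interpolation)."""
    lower = int(position)              # index of the neighbour below
    fraction = position - lower        # how far to move towards the next value
    below = sorted_values[lower - 1]
    if fraction == 0:
        return below
    above = sorted_values[lower]
    return below + fraction * (above - below)

heights = sorted([8.78, 8.07, 8.15, 8.98, 9.02, 7.99, 8.52, 8.27, 8.89])
n = len(heights)

lq = value_at_position(heights, 0.25 * n + 0.5)   # position 2.75
uq = value_at_position(heights, 0.75 * n + 0.5)   # position 7.25
iqr = uq - lq
siqr = iqr / 2                                     # semi-inter-quartile range

print(round(lq, 2), round(uq, 2), round(iqr, 2))
```

Other textbooks and libraries use slightly different position rules, so quartiles from other tools may differ in the second decimal.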

  • Data Description

    Mean Absolute Deviation (MAD)

    $\text{M.A.D.} = \frac{\sum |x - \bar{x}|}{n}$

    Observation   Deviation from mean   Absolute deviation from mean
    Height (m)    $x - \bar{x}$         $|x - \bar{x}|$
    8.78           0.26                 0.26
    8.07          -0.45                 0.45
    8.15          -0.37                 0.37
    8.98           0.46                 0.46
    9.02           0.50                 0.50
    7.99          -0.53                 0.53
    8.52           0.00                 0.00
    8.27          -0.25                 0.25
    8.89           0.37                 0.37
    Sum            0.00                 3.19

    $\text{M.A.D.} = \frac{3.19}{9} = 0.354$

  • Data Description

    Standard Deviation
    The SD is approximately the average of the difference between each value and the mean.
    The SD is the most commonly used measure of variation.
    The square of the SD is called the variance.

    $S = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$

    $\text{var} = S^2$

  • Data Description

    Standard Deviation - Example

    $S = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$

    Observation   Deviation from mean   Squared deviation from mean
    Height (m)    $x - \bar{x}$         $(x - \bar{x})^2$
    8.78           0.26                 0.0676
    8.07          -0.45                 0.2025
    8.15          -0.37                 0.1369
    8.98           0.46                 0.2116
    9.02           0.50                 0.2500
    7.99          -0.53                 0.2809
    8.52           0.00                 0.0000
    8.27          -0.25                 0.0625
    8.89           0.37                 0.1369
    Sum            0.00                 1.3489

    $S = \sqrt{\frac{1.3489}{8}} = 0.411$
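
The two measures of variation can be reproduced with a short Python sketch. It uses the unrounded mean, whereas the tables round the mean to 8.52, so the MAD may differ from the slide value in the third decimal. Variable names are illustrative.

```python
from math import sqrt

heights = [8.78, 8.07, 8.15, 8.98, 9.02, 7.99, 8.52, 8.27, 8.89]
n = len(heights)
x_bar = sum(heights) / n

# Mean absolute deviation: average distance from the mean, divided by n.
mad = sum(abs(x - x_bar) for x in heights) / n

# Sample variance and standard deviation: squared distances, divided by n - 1.
variance = sum((x - x_bar) ** 2 for x in heights) / (n - 1)
sd = sqrt(variance)

print(round(mad, 3), round(variance, 4), round(sd, 3))
```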

  • Data Description

    Standard Deviation: Typical Observations

    Dataset size                               Description
    Small dataset (10 observations)            Few observations further than 1 SD from the mean
    Medium-sized dataset (100 observations)    Few observations further than 2 SD from the mean
    Large dataset (1000 observations)          Few observations further than 3 SD from the mean

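  • Data Description

    The box-and-whisker plot
    [Figure: box-and-whisker plot on an axis from 10 to 70; the box runs from LQ to UQ with the median marked inside it, the whiskers extend on either side, and the values 20 and 55 are marked]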

  • Data Description

    Cumulative frequency plot
    The vertical axis represents the number or percentage of observations that are less than or equal to the x value on the horizontal axis.

  • Data Description

    Flood Example
    [Figure: cumulative frequency plot; horizontal axis "Size of flood" with marks at 0, 12, 13, 16, 17, 20, 26, 33, 38, 50, 61 and 70; vertical axis "Frequency" from 0 to 1]

  • Data Description

    Flood Example: IQR and the box-and-whisker plot
    [Figure: the cumulative frequency plot of flood size with a box-and-whisker plot below it; the values marked are 12, 16, 23, 38 and 61]

  • Data Description

    Large Datasets
    See the arm length measurements below.

    723 800 725 784 708 795 758 757 791 761
    737 777 780 763 765 860 812 754 776 788
    787 835 777 808 761 727 811 765 751 759
    771 830 767 811 767 774 802 706 796 801
    817 829 805 834 802 801 811 831 817 801
    803 776 796 806 786 757 833 811 791 796
    801 809 727 804 781 763 774 747 783 764
    816 720 802 789 830 854 830 795 785 759
    809 740 798 775 806 754 754 758 694 765
    810 786 767 782 833 846 845 846 817 741

  • Data Description

    Tabulate this data
    Choosing classes:
      Use convenient round numbers.
      Sufficiently many classes: not too wide a grouping.
      Sufficiently few classes: between 5 and 15 classes.
      Each observation must go into only one class.

    Example 1

    Range          Count   Cumulative
    Up to 699        1        1
    700-719          2        3
    720-739          6        9
    740-759         13       22
    760-769         12       34
    770-779          8       42
    780-789         11       53
    790-799          7       60
    800-809         16       76
    810-829         11       87
    830-849         11       98
    850-869          2      100
    870 and over     0      100

  • Data Description

    Stem-and-leaf plots

    69 | 4
    70 | 6 6
    71 |
    72 | 0 7 7 5 3
    73 | 7
    74 | 2 0 7
    75 | 8 4 4 9 7 9 1 4 7 8
    76 | 7 5 4 3 9 7 7 5 1 5 3 1
    77 | 5 4 6 4 1 7 6 7
    78 | 2 6 5 9 3 1 6 7 8 0 3
    79 | 8 5 6 1 6 1 5
    80 | 6 9 2 4 9 1 6 3 1 1 2 5 1 2 8 0
    81 | 7 0 6 1 7 1 7 1 1 2
    82 | 9
    83 | 3 0 0 3 1 4 0 5
    84 | 6 5 6
    85 | 4
    86 | 0

  • Data Description

    Histogram presentation
    [Figure: histogram of arm length (mm), 650 to 900, against the proportion of observations per class width, 0 to 16]

    Note the change in class width: where a class is double the width of the selected standard class, its frequency is halved. It is better to keep the class width constant.

  • Data Description

    Cumulative frequency plot
    [Figure: cumulative frequency plot of arm length (mm), 680 to 880, against % observations]

  • Data Description

    Temperature Rainfall example (class example)

    Temperature   Rainfall
    3.1           3.6
    3.0           3.4
    3.2           3.9
    3.1           4.7
    3.2           2.9
    3.3           2.2

  • Data Description

    Temperature Rainfall example (class example)
    [Figure: scatter plot of rainfall (hundreds of mm, 1.5 to 5.0) against temperature (thousands of degrees, 2.9 to 3.3)]

  • Data Description

    Temperature Rainfall example (class example)

    Temperature   Rainfall   (Temperature)^2   (Rainfall)^2   Temperature x Rainfall
    3.1           3.6         9.61             12.96          11.16
    3.0           3.4         9.00             11.56          10.20
    3.2           3.9        10.24             15.21          12.48
    3.1           4.7         9.61             22.09          14.57
    3.2           2.9        10.24              8.41           9.28
    3.3           2.2        10.89              4.84           7.26
    18.9          20.7       59.59             75.07          64.95   (sums)

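  • Data Description

    Temperature Rainfall example (class example)

    With x = temperature, y = rainfall and n = 6:

    $S_{xx} = 59.59 - \tfrac{1}{6}(18.9)^2 = 0.055$
    $S_{yy} = 75.07 - \tfrac{1}{6}(20.7)^2 = 3.655$
    $S_{xy} = 64.95 - \tfrac{1}{6}(18.9)(20.7) = -0.255$

    Correlation coefficient:
    $r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} = \frac{-0.255}{\sqrt{0.055 \times 3.655}} = -0.57$

    Regression line $y = a + bx$:
    $b = \frac{S_{xy}}{S_{xx}} = \frac{-0.255}{0.055} = -4.64$
    $a = \bar{y} - b\bar{x} = 3.45 - (-4.64)(3.15) \approx 18.05$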
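
The correlation and regression arithmetic can be verified with the Python sketch below. It assumes temperature is x and rainfall is y, and that the intercept refers to the temperature values as tabulated (not shifted or coded); the variable names are illustrative.

```python
from math import sqrt

temperature = [3.1, 3.0, 3.2, 3.1, 3.2, 3.3]   # x
rainfall = [3.6, 3.4, 3.9, 4.7, 2.9, 2.2]      # y
n = len(temperature)

sum_x = sum(temperature)
sum_y = sum(rainfall)

# Corrected sums of squares and products.
s_xx = sum(x * x for x in temperature) - sum_x ** 2 / n
s_yy = sum(y * y for y in rainfall) - sum_y ** 2 / n
s_xy = sum(x * y for x, y in zip(temperature, rainfall)) - sum_x * sum_y / n

r = s_xy / sqrt(s_xx * s_yy)          # correlation coefficient
b = s_xy / s_xx                       # slope of the regression line
a = sum_y / n - b * sum_x / n         # intercept: y_bar - b * x_bar

print(round(r, 2), round(b, 2), round(a, 2))
```

The negative r and b both say the same thing: in this small sample, rainfall tends to fall as temperature rises.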

  • Probability

    What is meant by probability?
    The probability of an event is the chance that it will occur.

    A priori approach
    Sometimes the experimental set-up is so clear that we know the probabilities in advance of collecting any data.

  • Probability

    A priori approach: examples
      Coin
        P[x = head] = P[x = tail] = 0.5
      Dice
        P[x = 1] = P[x = 2] = P[x = 3] = P[x = 4] = P[x = 5] = P[x = 6] = 1/6
      Cards
        P[x = 4 of hearts] = 1/52
        P[x = Ace] = 4/52 = 1/13

  • Probability

    Empirical approach
    By having enough experimental data.
    Example:
      Break 100 concrete cubes. The strength of 30 cubes is more than 50 MPa.
      P[x > 50] = 30/100 = 0.3

  • Probability

    Subjective assessments
    What is the probability that it will rain on the 27th of July in Pretoria?
      Argument: during winter it does not rain regularly in Pretoria.
      No experimental information.
      Intuition: P[x = rain] = 0.05

  • Probability

    Probability Scale

    Probability   Event
    1.0           Dying
    0.9           Pass Statistics
    0.5           Coin
    0.167         Dice
    0             Swim across the Atlantic Ocean

  • Probability

    Mutually-exclusive events
    This means they cannot occur together.
    Consequently, the probability of one or the other of two mutually-exclusive events occurring is the sum of their individual probabilities.

  • Probability

    Example: mutually-exclusive events
    The number of students on campus is 25000.
      B.Eng = 5000
      B.Com = 8000
    Choosing a student:
      P[B.Eng] = 5000/25000 = 0.2
      P[B.Com] = 8000/25000 = 0.32

    $P[\text{B.Eng} \cup \text{B.Com}] = 0.2 + 0.32 = 0.52$
    $P[\text{B.Eng} \cap \text{B.Com}] = 0$

  • Probability

    Venn Diagram: mutually exclusive
    [Figure: Venn diagram with three non-overlapping regions, B.Eng, B.Com and Other]
      P[B.Eng] = 0.2
      P[B.Com] = 0.32
      P[Other] = 0.48
      Sum of Ps = 1

  • Probability

    Independent probability
    Where two experiments do not influence each other.
    The probability of both happening is the product of their individual probabilities.
    Example:
      What sex is a person? Man: A, Woman: B
      Does the person own a vehicle? Yes: C, No: D
      P[A] = 0.45, P[B] = 0.55, P[C] = 0.8, P[D] = 0.2

  • Probability

    Example

                    Man     Woman
    Vehicle: No     0.09    0.11
    Vehicle: Yes    0.36    0.44

    $P[A \cap C] = 0.45 \times 0.8 = 0.36$
    $P[A \cap D] = 0.45 \times 0.2 = 0.09$
    $P[B \cap C] = 0.55 \times 0.8 = 0.44$
    $P[B \cap D] = 0.55 \times 0.2 = 0.11$

  • Probability

    General Addition Rule
    [Figure: Venn diagram of two overlapping events A and C]
    A is made up of {A but not C} plus {A and C}.
    C is made up of {C but not A} plus {A and C}.
    Therefore {A or C} is {A} plus {C}, minus {A and C}.

  • Probability

    Example

                    Man     Woman
    Vehicle: No     0.09    0.11
    Vehicle: Yes    0.36    0.44

    $P[A \cap C] = 0.45 \times 0.8 = 0.36$
    $P[A \cap D] = 0.45 \times 0.2 = 0.09$
    $P[B \cap C] = 0.55 \times 0.8 = 0.44$
    $P[B \cap D] = 0.55 \times 0.2 = 0.11$

    $P[A \cup C] = P[A] + P[C] - P[A \cap C] = 0.45 + 0.8 - 0.36 = 0.89$
    $P[B \cup C] = P[B] + P[C] - P[B \cap C] = 0.55 + 0.8 - 0.44 = 0.91$
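
A minimal Python sketch of the same example, assuming (as the slides do) that sex and vehicle ownership are independent. The dictionary keys and the helper `p_or` are illustrative names.

```python
p_sex = {"man": 0.45, "woman": 0.55}
p_vehicle = {"yes": 0.8, "no": 0.2}

# Independence: each joint probability is the product of the two marginals.
joint = {(s, v): p_sex[s] * p_vehicle[v] for s in p_sex for v in p_vehicle}

def p_or(sex, vehicle):
    """General addition rule: P[S or V] = P[S] + P[V] - P[S and V]."""
    return p_sex[sex] + p_vehicle[vehicle] - joint[(sex, vehicle)]

p_man_or_vehicle = p_or("man", "yes")       # P[A or C]
p_woman_or_vehicle = p_or("woman", "yes")   # P[B or C]

print(round(p_man_or_vehicle, 2), round(p_woman_or_vehicle, 2))
```

Subtracting the joint probability removes the overlap that would otherwise be counted twice.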

  • Probability

    Same Example: Probability Tree

    Sex
      Man (P = 0.45)
        Vehicle: Yes (P = 0.8)   0.36
        Vehicle: No (P = 0.2)    0.09
      Woman (P = 0.55)
        Vehicle: Yes (P = 0.8)   0.44
        Vehicle: No (P = 0.2)    0.11

  • Probability

    General Multiplication Rule
    [Figure: Venn diagram of two overlapping events A and C]

    $P\{A \text{ and } C\} = P\{A\}\,P\{C \mid A\}$
    or
    $P\{A \text{ and } C\} = P\{C\}\,P\{A \mid C\}$

    where the vertical line "|" means "given that"; $P\{C \mid A\}$ is referred to as a conditional probability.

  • Probability

    Example
    12 people: 9 native born, 3 foreign born.
    If we select 2 people, what is the probability that both are foreign born?
      P[F1] = 3/12 = 0.25
      Once F1 has occurred we know that there are 11 people remaining, of whom 2 are foreign born.
      P[F2 | F1] = 2/11 = 0.1818
      P[F1 and F2] = 0.25 x 0.1818 = 0.04545 = 1/22
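
The multiplication rule for this example can be checked exactly with fractions. The sketch below also cross-checks the answer by counting pairs, which is an added illustration rather than something on the slide; the variable names are hypothetical.

```python
from fractions import Fraction
from math import comb

foreign, total = 3, 12

p_f1 = Fraction(foreign, total)                    # first person foreign born
p_f2_given_f1 = Fraction(foreign - 1, total - 1)   # second, given the first was
p_both = p_f1 * p_f2_given_f1                      # general multiplication rule

# Cross-check: pairs of foreign-born people over all possible pairs.
p_both_by_counting = Fraction(comb(foreign, 2), comb(total, 2))

print(p_both, float(p_both))
```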

  • Probability

    Example
    200 students: 77 study Accounting, 64 study Law, 92 study neither.
    The other 3 numbers total 200 - 92 = 108.
      Only Accounting = 108 - 64 = 44
      Only Law = 108 - 77 = 31
      Both = 77 - 44 or 64 - 31 = 33

    [Figure: Venn diagram of A and L with 44 in A only, 33 in both, 31 in L only and 92 outside]

    $P[A \,\&\, L] = \frac{33}{200} = 0.165$
    $P[\text{Only } A \mid A] = \frac{44}{77} = 0.571$

  • Probability

    Class Example
    A concrete beam will fail if the concrete is too weak or the load is too high.
      P[Weak] = 0.2
      P[High] = 0.3
      P[Failure] = ?

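  • Probability

    P[Weak] = 0.2, P[High] = 0.3, P[Failure] = ?

    $P[\text{Failure}] = P[\text{Weak} \cup \text{High}] = P[\text{Weak}] + P[\text{High}] - P[\text{Weak} \cap \text{High}]$
    $P[\text{Failure}] = P[\text{Weak}] + P[\text{High}] - P[\text{Weak}]\,P[\text{High}]$
    $P[\text{Failure}] = 0.2 + 0.3 - (0.2)(0.3)$
    $P[\text{Failure}] = 0.5 - 0.06 = 0.44$

    [Figure: Venn diagram of W (0.2) and H (0.3) overlapping in 0.06]

                             Concrete weak: Yes   Concrete weak: No   Total
    Load too high: Yes       0.06                 0.24                0.3
    Load too high: No        0.14                 0.56                0.7
    Total                    0.2                  0.8

    P[Failure] is the sum of the three cells in which the concrete is weak or the load is too high.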

  • Probability

    P[Weak] = 0.2, P[High] = 0.3, P[Failure] = ?

    $P[\text{Failure}] = P[\text{Weak}] + P[\text{High}] - P[\text{Weak}]\,P[\text{High}] = 0.2 + 0.3 - (0.2)(0.3) = 0.44$

    Probability tree:
      Concrete weak (P = 0.2)
        Load high (P = 0.3)   0.06
        Load OK (P = 0.7)     0.14
      Concrete OK (P = 0.8)
        Load high (P = 0.3)   0.24
        Load OK (P = 0.7)     0.56

    P[Failure] = 0.06 + 0.14 + 0.24 = 0.44
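
As a quick check of the beam example, the sketch below computes the failure probability two ways: with the addition rule, and as the complement of "concrete OK and load OK". It assumes, as the slides do, that weak concrete and high load are independent; the variable names are illustrative.

```python
p_weak, p_high = 0.2, 0.3

# Addition rule with independence: P[W and H] = P[W] * P[H].
p_both = p_weak * p_high
p_failure = p_weak + p_high - p_both

# Complement: the beam survives only if the concrete is OK AND the load is OK.
p_failure_via_complement = 1 - (1 - p_weak) * (1 - p_high)

print(round(p_failure, 2), round(p_failure_via_complement, 2))
```

The complement form is often the easier one to extend when more than two independent causes of failure are involved.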

  • Probability

    Class Example
    A low-water bridge is designed to allow for a flood occurring once every 10 years. Damage occurs during each flood. The bridge is also located in an active seismic region, and the probability of a destructive earthquake occurring in a year is 30%. Determine the probability of damage during any given year, assuming that floods and earthquakes are statistically independent.

  • Probability

    Class example: Solution
    P[Flood] = 0.1, P[No flood] = 0.9
    P[Quake] = 0.3, P[No quake] = 0.7

    $P[\text{Quake} \cap \text{Flood}] = 0.3 \times 0.1 = 0.03$
    $P[\text{Quake} \cup \text{Flood}] = P[\text{Quake}] + P[\text{Flood}] - P[\text{Quake} \cap \text{Flood}] = 0.3 + 0.1 - 0.03 = 0.37$

  • Probability

    Class example
    Vehicles are classified by mass into light, medium and heavy classes, with a class limit at 10 ton. Vehicle counts show that 60% are light, 30% medium and 10% heavy vehicles. Vehicles are weighed regularly. It is known that the probability that a vehicle is light and overloaded is 0.12. Calculate the probability that the next vehicle is heavy or that the next vehicle is overloaded. Assume statistical independence.

  • Probability

    Class example solution
    P[light] = 0.6
    P[light and overloaded] = 0.12
    P[light and overloaded] = P[light] P[overloaded]
    0.12 = 0.6 x P[overloaded]
    P[overloaded] = 0.12/0.6 = 0.2

  • Probability

    Example continued

    Vehicle
      Light (0.6)
        Overloaded: Yes (0.2)   0.12
        Overloaded: No (0.8)    0.48
      Medium (0.3)
        Overloaded: Yes (0.2)   0.06
        Overloaded: No (0.8)    0.24
      Heavy (0.1)
        Overloaded: Yes (0.2)   0.02
        Overloaded: No (0.8)    0.08

    $P[\text{Overloaded} \cup \text{Heavy}] = P[\text{Overloaded}] + P[\text{Heavy}] - P[\text{Overloaded} \cap \text{Heavy}] = 0.2 + 0.1 - 0.02 = 0.28$

  • Probability

    Exclusive and Independent
    Exclusive: events that never occur together.
    Independent: the proportion of times A occurs is the same whether or not B occurs.

    Using a Venn diagram
    [Figure: Venn diagram of A and B with w in A only, x in both A and B, y in B only and z outside both]
      A & B mutually exclusive: x = 0
      A & B independent: w/x = z/y
      w & x refer to events in A; y & z refer to events outside A.
      See paragraph 66. NB!

  • Probability

    Example: Independent features
    Of a total of 200 students, 80 study biology and 90 study neither biology nor geography. The choice of whether a student does or does not study biology is independent of their choice of studying geography.
    Let us determine all the required numbers in the Venn diagram.

  • Probability

    Example solution
    [Figure: Venn diagram of B and G with w in B only, x in both, y in G only and z outside both]

    z = 90 (given)
    w + x = 80
    Total outside B = 200 - 80 = 120
    Number inside G but outside B = 120 - 90 = 30, so y = 30
    w/x = z/y = 90/30, so w = 3x
    w + x = 4x = 80
    x = 20
    w = 60
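
The same solution can be written as a short Python sketch. It solves w/x = z/y together with w + x = 80 and then confirms independence by cross-multiplication, so no rounding is involved. The variable names follow the Venn diagram labels.

```python
total, biology, neither = 200, 80, 90

z = neither                     # outside both B and G
outside_b = total - biology     # everyone not studying biology
y = outside_b - z               # geography only

# Independence: w/x = z/y, together with w + x = biology.
# Substituting w = x * z / y gives x * (y + z) / y = biology.
x = biology * y // (y + z)      # both biology and geography
w = biology - x                 # biology only

# Independence check: P[B and G] = P[B] * P[G], cross-multiplied by total^2.
is_independent = x * total == biology * (x + y)

print(w, x, y, z, is_independent)
```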

  • Probability

    Conditional Probability
    N tests consist of: $N = n_{11} + n_{22} + n_{12} + n_{21}$
    [Figure: Venn diagram of A and B with $n_{12}$ in A only, $n_{11}$ in both, $n_{21}$ in B only and $n_{22}$ outside both]

    $P[A \cap B] = \frac{n_{11}}{N}$
    $P[A] = \frac{n_{11} + n_{12}}{N}$
    $P[B] = \frac{n_{11} + n_{21}}{N}$
    $P[A \mid B] = \frac{n_{11}}{n_{11} + n_{21}} = \frac{n_{11}/N}{(n_{11} + n_{21})/N} = \frac{P[A \cap B]}{P[B]}$
    $P[B \mid A] = \frac{P[A \cap B]}{P[A]}$

  • Probability

    Class Example: van As & Joubert
    [Figure: traffic from two areas joining one road]
      Area 1: 400 vehicles/hour, 1% buses
      Area 2: 600 vehicles/hour, 3% buses
    Calculate the probability that the next vehicle will be a bus.

  • Probability

    Example: Solution

    $P[V \text{ from } A_1] = 0.4 \qquad P[V \text{ from } A_2] = 0.6$
    $P[B \mid A_1] = 0.01 \qquad P[B \mid A_2] = 0.03$

    $P[B \mid A_1] = \frac{P[B \cap A_1]}{P[A_1]}$, so $P[B \cap A_1] = P[B \mid A_1]\,P[A_1] = 0.01 \times 0.4 = 0.004$
    $P[B \mid A_2] = \frac{P[B \cap A_2]}{P[A_2]}$, so $P[B \cap A_2] = P[B \mid A_2]\,P[A_2] = 0.03 \times 0.6 = 0.018$
    $P[B] = 0.004 + 0.018 = 0.022$
    $P[M] = 1 - 0.022 = 0.978$

    Area     Bus
    1        0.004
    2        0.018
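
The bus example is a weighted sum over the two areas, which the sketch below makes explicit. It assumes the share of traffic from each area is proportional to its hourly flow, as the solution does; the dictionary names are illustrative.

```python
flow = {"area 1": 400, "area 2": 600}                  # vehicles per hour
p_bus_given_area = {"area 1": 0.01, "area 2": 0.03}    # share of buses per area

total_flow = sum(flow.values())
p_area = {area: vehicles / total_flow for area, vehicles in flow.items()}

# P[Bus and Area] = P[Bus | Area] * P[Area]; summing over areas gives P[Bus].
p_bus_and_area = {area: p_bus_given_area[area] * p_area[area] for area in flow}
p_bus = sum(p_bus_and_area.values())
p_motor_vehicle = 1 - p_bus

print(round(p_bus, 3), round(p_motor_vehicle, 3))
```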

  • Probability

    Bayes' Theorem

    $P[A \mid B] = \frac{P[A \cap B]}{P[B]}$
    $P[B \mid A] = \frac{P[A \cap B]}{P[A]}$, so $P[A \cap B] = P[B \mid A]\,P[A]$
    $P[A \mid B] = \frac{P[B \mid A]\,P[A]}{P[B]}$

  • Probability

    Class Example
    Ready-mix concrete is delivered to a site. This concrete is used to cast columns.
      $B_0$: weak concrete
      $B_1$: good concrete
      $K_0$: column fails
      $K_1$: column does not fail
      $P[B_0] = 0.1$
      $P[B_1] = 0.9$
      $P[K_0 \mid B_0] = 0.6$
      $P[K_1 \mid B_1] = 0.98$
      $P[K_0] = ?$
      $P[B_0 \mid K_0] = ?$

  • Probability

    Example Solution

    $P[K_0 \mid B_0] = \frac{P[K_0 \cap B_0]}{P[B_0]}$, so $P[K_0 \cap B_0] = P[K_0 \mid B_0]\,P[B_0] = 0.6 \times 0.1 = 0.06$
    $P[K_1 \mid B_1] = \frac{P[K_1 \cap B_1]}{P[B_1]}$, so $P[K_1 \cap B_1] = P[K_1 \mid B_1]\,P[B_1] = 0.98 \times 0.9 = 0.882$
    $P[K_0] = 0.018 + 0.060 = 0.078$
    Bayes: $P[B_0 \mid K_0] = \frac{P[K_0 \mid B_0]\,P[B_0]}{P[K_0]} = \frac{0.6 \times 0.1}{0.078} = 0.77$

                     P[K1]    P[K0]
    P[B1] = 0.9      0.882    0.018
    P[B0] = 0.1      0.04     0.06

  • Probability

    Class Example - van As & Joubert
    Gravel comes from a quarry. 70% of the material is classified as good material. 20% of the good material fails during testing, and 10% of the weak material does not fail during testing. Calculate the probability that the material will pass the test. Then calculate the probability that the material is good, given that it passes the test.

  • Probability

    Example: Solution

    Material
      Good (0.7)
        Test: Fail (0.2)   0.14
        Test: Pass (0.8)   0.56
      Weak (0.3)
        Test: Fail (0.9)   0.27
        Test: Pass (0.1)   0.03

    P[Pass] = 0.56 + 0.03 = 0.59

    $P[\text{Good} \mid \text{Pass}] = \frac{P[\text{Pass} \mid \text{Good}]\,P[\text{Good}]}{P[\text{Pass}]} = \frac{0.8 \times 0.7}{0.56 + 0.03} = 0.949$
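
Both Bayes examples (the concrete columns and the quarry gravel) follow the same pattern, sketched below in Python. The helper `posterior` and the dictionary keys are hypothetical names chosen for this illustration.

```python
def posterior(prior, likelihood):
    """Return (P[H | E] for each hypothesis H, P[E]) from P[H] and P[E | H]."""
    joint = {h: likelihood[h] * prior[h] for h in prior}   # P[E and H]
    evidence = sum(joint.values())                         # P[E], total probability
    return {h: joint[h] / evidence for h in joint}, evidence

# Concrete columns: E = "column fails".
concrete_prior = {"weak": 0.1, "good": 0.9}
p_fail_given = {"weak": 0.6, "good": 1 - 0.98}
concrete_post, p_column_fails = posterior(concrete_prior, p_fail_given)

# Quarry gravel: E = "material passes the test".
gravel_prior = {"good": 0.7, "weak": 0.3}
p_pass_given = {"good": 1 - 0.2, "weak": 0.1}
gravel_post, p_pass = posterior(gravel_prior, p_pass_given)

print(round(p_column_fails, 3), round(concrete_post["weak"], 2))
print(round(p_pass, 2), round(gravel_post["good"], 3))
```

The denominator is always the total probability of the observed outcome, summed over every branch of the tree that leads to it.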

  • Probability

    Combinations and Permutations
    Let's start with an example: how many 6-letter words can we make with the letters of the word GROUND?
      1st letter - 6 choices
      2nd letter - 5 choices
      3rd letter - 4 choices
      4th letter - 3 choices
      5th letter - 2 choices
      6th letter - 1 choice
      Total = 6 x 5 x 4 x 3 x 2 x 1 = 6!

    How many 3-letter words can we make?
      1st letter - 6 choices
      2nd letter - 5 choices
      3rd letter - 4 choices
      Total = 6 x 5 x 4 = 6!/3!

  • Probability

    Permutations
    ${}^{n}P_{x}$ is the number of permutations consisting of x elements from a set of n elements.
    ${}^{n}P_{x} = \frac{n!}{(n-x)!}$
    For permutations the order is important. Therefore GRO ≠ ORG.
    For combinations the order is not important. Therefore GRO = ORG.

  • Probability

    Combinations
    ${}^{n}C_{x}$ is the number of combinations that consist of x elements from a set of n elements.
    If we can make ${}^{n}C_{x}$ combinations, then for each combination we can find x! permutations.
    $({}^{n}C_{x})\,x! = {}^{n}P_{x} = \frac{n!}{(n-x)!}$
    Therefore:
    ${}^{n}C_{x} = \frac{n!}{(n-x)!\,x!}$

  • Probability

    Example: National Lottery
    Choose 6 balls from 49 balls; the order is not important (number of combinations).

    ${}^{49}C_{6} = \frac{n!}{(n-x)!\,x!} = \frac{49!}{(49-6)!\,6!} \approx 14 \times 10^{6}$

    If you buy 1 ticket every week, you will on average win the lotto once in 269 000 years.
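
The counting results on the last few slides can be confirmed with Python's standard `math` module (`factorial`, `perm` and `comb`). This is a small sketch; the variable names are illustrative, and a year is taken as 52 weeks.

```python
from math import comb, factorial, perm

# GROUND has 6 distinct letters.
six_letter_words = factorial(6)          # 6 x 5 x 4 x 3 x 2 x 1
three_letter_words = perm(6, 3)          # 6!/(6-3)! = 6 x 5 x 4

# Each combination of 3 letters can be ordered in 3! ways.
three_letter_combinations = comb(6, 3)

# Lottery: choose 6 balls from 49, order not important.
lottery_combinations = comb(49, 6)
years_per_win = lottery_combinations / 52    # one ticket per week

print(six_letter_words, three_letter_words, lottery_combinations)
```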

  • Probability

    Binomial Distribution
    To introduce it, suppose we have a six-sided dice with four sides painted blue and two sides painted red. At any roll, there is a one-third chance of getting a red face and a two-thirds chance of getting a blue face.
    Determine the probability of 3 red faces consecutively.

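  • Probability

    Binomial Distribution
    Enables us to calculate the probability of obtaining exactly x successes in n trials, when the probability of success at any one trial is p.

    $P[X = x] = {}^{n}C_{x}\,p^{x}(1-p)^{n-x}$

    With n = 3 rolls and p = 1/3 (red):
    $P[X = 0] = {}^{3}C_{0}\left(\tfrac{1}{3}\right)^{0}\left(1-\tfrac{1}{3}\right)^{3} = 1 \times \left(\tfrac{2}{3}\right)^{3} = 0.296$
    $P[X = 1] = {}^{3}C_{1}\left(\tfrac{1}{3}\right)^{1}\left(1-\tfrac{1}{3}\right)^{2} = 3 \times \tfrac{1}{3} \times \left(\tfrac{2}{3}\right)^{2} = 0.444$
    $P[X = 2] = {}^{3}C_{2}\left(\tfrac{1}{3}\right)^{2}\left(1-\tfrac{1}{3}\right)^{1} = 3 \times \left(\tfrac{1}{3}\right)^{2} \times \tfrac{2}{3} = 0.222$
    $P[X = 3] = {}^{3}C_{3}\left(\tfrac{1}{3}\right)^{3}\left(1-\tfrac{1}{3}\right)^{0} = 1 \times \left(\tfrac{1}{3}\right)^{3} = 0.037$

    [Figure: probability tree for three rolls of the dice, each roll branching into R (1/3) and B (2/3); the branch RRR = 0.037]
    [Figure: bar chart of probability (0 to 0.5) against the number of red faces (0, 1, 2, 3)]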

  • Probability

    Class Example
    A company manufactures water pipes that can resist 500 kPa water pressure. It is known that the probability that a pipe will burst at 500 kPa is 0.05. If 4 water pipes are selected out of a group of pipes, what is the probability that:
      1. Exactly 1 pipe will burst
      2. More than 2 pipes will burst
      3. All 4 pipes will burst
      4. Any pipe will burst

  • Probability

    Example Solution

    $$P[X = x] = \binom{n}{x} p^{x} (1 - p)^{n - x}, \qquad n = 4, \quad p = 0.05$$

    $$P[X = 0] = \binom{4}{0} (0.05)^{0} (1 - 0.05)^{4} = 1 (0.05)^{0} (0.95)^{4} = 0.8145$$

    $$P[X = 1] = \binom{4}{1} (0.05)^{1} (1 - 0.05)^{3} = 4 (0.05)^{1} (0.95)^{3} = 0.1715$$

    $$P[X = 2] = \binom{4}{2} (0.05)^{2} (1 - 0.05)^{2} = 6 (0.05)^{2} (0.95)^{2} = 0.0135$$

    $$P[X = 3] = \binom{4}{3} (0.05)^{3} (1 - 0.05)^{1} = 4 (0.05)^{3} (0.95)^{1} = 0.00048$$

    $$P[X = 4] = \binom{4}{4} (0.05)^{4} (1 - 0.05)^{0} = 1 (0.05)^{4} (0.95)^{0} = 0.00001$$

    1. P[X = 1] = 0.1715
    2. P[X > 2] = P[X = 3] + P[X = 4] = 0.00048 + 0.00001 = 0.00049
    3. P[X = 4] = 0.00001
    4. Any pipe bursts = 1 - P[X = 0] = 1 - 0.8145 = 0.1855

    Note that the sum of P[X = x] over all values of x is 1.
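    The four answers can be reproduced with a few lines of Python. This is a minimal sketch; the helper name `binomial_pmf` and the variable names are assumptions chosen for this example.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 4, 0.05  # 4 pipes, each bursting with probability 0.05
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

exactly_one = pmf[1]
more_than_two = pmf[3] + pmf[4]
all_four = pmf[4]
any_burst = 1 - pmf[0]  # complement of "no pipe bursts"

print(exactly_one, more_than_two, all_four, any_burst)
```

    Summing the unrounded terms gives about 0.00048 for more than 2 bursts; the 0.00049 in the solution comes from adding the two already rounded values.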

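  • Probability

    Poisson Distribution

    Binomial distribution:

    $$P[X = x] = \frac{n!}{x!\,(n - x)!}\, p^{x} (1 - p)^{n - x}$$

    For large n and small p, where np is constant, let $\lambda = np$ (the expected value), so that $p = \lambda / n$:

    $$P[X = x] = \frac{n!}{x!\,(n - x)!} \left(\frac{\lambda}{n}\right)^{x} \left(1 - \frac{\lambda}{n}\right)^{n - x}$$

    $$P[X = x] = \frac{n (n - 1) \cdots (n - x + 1)}{n^{x}} \cdot \frac{\lambda^{x}}{x!} \left(1 - \frac{\lambda}{n}\right)^{n} \left(1 - \frac{\lambda}{n}\right)^{-x}$$

    For large n:

    $$\frac{n (n - 1) \cdots (n - x + 1)}{n^{x}} \to 1, \qquad \left(1 - \frac{\lambda}{n}\right)^{-x} \to 1, \qquad \left(1 - \frac{\lambda}{n}\right)^{n} \to e^{-\lambda}$$

    $$P[X = x] = \frac{\lambda^{x} e^{-\lambda}}{x!}$$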

  • Probability

    Class Example

    Pretoria experiences an average of 2.8 large rain storms (> 50 mm/hour) per year. Calculate the probability that next year there will be:

    1. 1
    2. 2
    3. 3
    4. More than 2

    large rain storms.

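  • Probability

    Class Example - Solution

    $$P[X = x] = \frac{\lambda^{x} e^{-\lambda}}{x!}, \qquad \lambda = 2.8$$

    $$P[X = 1] = \frac{2.8^{1} e^{-2.8}}{1!} = 0.17$$

    $$P[X = 2] = \frac{2.8^{2} e^{-2.8}}{2!} = 0.238$$

    $$P[X = 3] = \frac{2.8^{3} e^{-2.8}}{3!} = 0.222$$

    $$P[X > 2] = 1 - \left(P[X = 0] + P[X = 1] + P[X = 2]\right)$$

    $$P[X > 2] = 1 - \left(\frac{2.8^{0} e^{-2.8}}{0!} + 0.17 + 0.238\right) = 1 - 0.469 = 0.531$$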
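    The storm probabilities can be checked numerically. This is a minimal sketch; `poisson_pmf` and the variable names are assumptions made for this example.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """Probability of exactly x events when lam events are expected."""
    return lam**x * exp(-lam) / factorial(x)


lam = 2.8  # average number of large rain storms per year
p = {x: poisson_pmf(x, lam) for x in range(4)}

# "More than 2" is the complement of 0, 1 or 2 storms.
more_than_two = 1 - (p[0] + p[1] + p[2])

for x in (1, 2, 3):
    print(x, round(p[x], 3))
print("more than 2:", round(more_than_two, 3))
```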

  • Probability

    Poisson Distribution vs. Binomial

    [Figure: P[X = x] (0 to 0.2) plotted over 0 to 20 on the horizontal axis (labelled n), comparing the Poisson and binomial distributions]

  • Probability

    Class Example

    Vehicle counts at a shopping centre show that an average of 120 vehicles arrive per hour. Using first the Poisson distribution, and then the binomial distribution with intervals of 2 seconds, calculate the probability that in the next minute:

    1. No vehicles arrive
    2. 2 vehicles arrive
    3. At least 2 vehicles arrive

  • Probability

    Class Example - Poisson

    $$\lambda = \frac{120}{60} = 2 \text{ vehicles/min}$$

    $$P[X = x] = \frac{\lambda^{x} e^{-\lambda}}{x!}$$

    $$P[X = 0] = 0.135 \qquad P[X = 2] = 0.271$$

    $$P[X \geq 2] = 1 - \left(P[X = 0] + P[X = 1]\right) = 0.594$$

  • Probability

    Class Example - Binomial

    Take 2-second intervals. Therefore n = 30 experiments/min.

    $$p = \frac{2 \times 120}{3600} = 0.0667, \qquad np = 2 \text{ vehicles/min}$$

    $$P[X = x] = \frac{n!}{x!\,(n - x)!}\, p^{x} (1 - p)^{n - x}$$

    $$P[X = 0] = 0.126 \qquad P[X = 1] = 0.270 \qquad P[X = 2] = 0.280$$

    $$P[X = 3] = 0.187 \qquad P[X = 4] = 0.090 \qquad P[X = 5] = 0.033$$
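    The two models for the vehicle counts can be put side by side in a short script. This is an illustrative sketch only; the list names and the `at_least_two_*` variables are assumptions for this example.

```python
from math import comb, exp, factorial

lam = 120 / 60   # expected vehicles per minute
n = 60 // 2      # number of 2-second intervals in one minute
p = lam / n      # probability of an arrival in one interval (0.0667)

# P[X = x] for x = 0..5 under each model.
poisson = [lam**x * exp(-lam) / factorial(x) for x in range(6)]
binomial = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(6)]

at_least_two_poisson = 1 - poisson[0] - poisson[1]
at_least_two_binomial = 1 - binomial[0] - binomial[1]

for x in range(6):
    print(x, round(poisson[x], 3), round(binomial[x], 3))
print("at least 2:", round(at_least_two_poisson, 3), round(at_least_two_binomial, 3))
```

    With n = 30 the two columns already agree to about two decimal places, which is the point of the comparison: the Poisson distribution is the limit of the binomial as the intervals get shorter.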

  • Probability

    Normal Distribution

    [Figure: bell-shaped normal distribution curve, Y-axis against X-axis]

  • Probability

    Normal Distribution

    $$y(x) = \frac{1}{\sigma \sqrt{2\pi}}\, e^{-\frac{1}{2} \left(\frac{x - \mu}{\sigma}\right)^{2}}$$

    [Figure: normal distribution curve, Y-axis against X-axis]

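  • Probability

    Standard Normal Distribution

    $$z = \frac{x - \mu}{\sigma}$$

    $$y(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} z^{2}}$$

    [Figure: standard normal curve y(z) against the Z-axis, centred on z = 0 with z = -1 and z = 1 marked and a peak of $1/\sqrt{2\pi}$]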

  • Probability

    Properties of the normal distribution

    Continuous function
    y(x) is the probability density
    Described completely by 2 parameters, μ and σ
    Probability density tends to 0 as x tends to ±∞
    dy/dx = 0 at x = μ, where the maximum is y(μ) = 1/(σ√(2π))
    Area under function = 1

    Standard Normal Distribution

    μ = 0
    SD = 1
    Function symmetric

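  • Probability

    Normalising/Transforming the Normal Distribution: Concrete Normal Distribution

    [Figure: normal distribution on the X-axis (ticks at 0, 50, 100), with μ = 50 and σ = 25]

    $$z = \frac{x - \mu}{\sigma} = \frac{x - 50}{25}$$

    [Figure: the transformed distribution on the Z-axis, with z = -1 and z = 1 marked; peak density $\frac{1}{\sigma \sqrt{2\pi}} = 0.016$]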

  • Probability

    Normalising/Transforming the Normal Distribution: Steel Normal Distribution

    [Figure: normal distribution on the X-axis (ticks at 180, 200, 250), with μ = 200 and σ = 10]

    $$z = \frac{x - \mu}{\sigma} = \frac{x - 200}{10}$$

    [Figure: the transformed distribution with z = -1 and z = 1 marked; peak density $\frac{1}{\sigma \sqrt{2\pi}} = 0.040$]
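    The transformation and the two peak densities quoted for concrete and steel can be reproduced with a short sketch. The function names `normal_density` and `standardise` are assumptions for this example, not names from the notes.

```python
from math import exp, pi, sqrt


def standardise(x, mu, sigma):
    """Transform x to the standard normal scale: z = (x - mu) / sigma."""
    return (x - mu) / sigma


def normal_density(x, mu, sigma):
    """Normal probability density y(x) for mean mu and standard deviation sigma."""
    z = standardise(x, mu, sigma)
    return exp(-0.5 * z * z) / (sigma * sqrt(2 * pi))


# The peak of the curve is at x = mu.
concrete_peak = normal_density(50, mu=50, sigma=25)
steel_peak = normal_density(200, mu=200, sigma=10)

print(round(concrete_peak, 3), round(steel_peak, 3))
print(standardise(75, 50, 25), standardise(190, 200, 10))
```

    One standard deviation above the concrete mean (x = 75) and one below the steel mean (x = 190) map to z = 1 and z = -1, which is why both distributions can share the same standard normal table.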

  • Probability

    Class Example

    Make use of the standard normal distribution table.

    Calculate the probability for z:
    1. Smaller than 0
    2. Smaller than 0.8
    3. Greater than 0.8
    4. Greater than 1.2
    5. Between 0.8 and 1.2
    6. Between -0.3 and 1.2

    For what value of z will the probability of getting a value smaller than z be:
    1. 0.9
    2. 0.2

    What is the probability that a value lies between:
    1. -1 and 1
    2. -2 and 2
    3. -3 and 3 SD of the mean

  • Probability

    Example Solution

    Here Φ(z) is the table value, that is, the probability of getting a value smaller than z.

    Probabilities
    1. Φ(0) = 0.5
    2. Φ(0.8) = 0.78814
    3. 1 - Φ(0.8) = 1 - 0.78814 = 0.21186
    4. 1 - Φ(1.2) = 1 - 0.88493 = 0.11507
    5. Φ(1.2) - Φ(0.8) = 0.09679
    6. Φ(1.2) - (1 - Φ(0.3)) = 0.50284

    Z Values
    1. Φ(z) = 0.9, therefore z = 1.28
    2. Φ(z) = 0.8 at z = 0.84, therefore z = -0.84

    Probabilities
    1. Φ(0
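    The table lookups can be cross-checked with Python's standard library, which provides the cumulative standard normal distribution and its inverse. This is a sketch under that assumption; the variable names are chosen for this example.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

below_08 = Z.cdf(0.8)                         # smaller than 0.8
above_12 = 1 - Z.cdf(1.2)                     # greater than 1.2
between_08_12 = Z.cdf(1.2) - Z.cdf(0.8)       # between 0.8 and 1.2
between_m03_12 = Z.cdf(1.2) - Z.cdf(-0.3)     # between -0.3 and 1.2

z_for_090 = Z.inv_cdf(0.9)                    # z with 90% of values below it
z_for_020 = Z.inv_cdf(0.2)                    # z with 20% of values below it

# Probability of falling within k standard deviations of the mean.
within = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}

print(round(below_08, 5), round(above_12, 5))
print(round(between_08_12, 5), round(between_m03_12, 5))
print(round(z_for_090, 2), round(z_for_020, 2))
print({k: round(v, 4) for k, v in within.items()})
```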

  • Probability

    Class Example

    A large number of concrete cubes were tested. The data set has a normal distribution with μ = 50 MPa and σ = 6 MPa.

    Calculate the probability that a cube's strength is:
    1. Larger than 65 MPa
    2. Less than 38 MPa
    3. Between 45 and 60 MPa
    4. Between 59 and 60 MPa (without table)
    5. Between 1 standard deviation more than the average and the 3rd quartile

    Also calculate:
    6. The strength such that only 5% of all cubes will be weaker than this strength
    7. The borders between which 80% of the strengths will fall

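  • Probability

    Class Example - Solution

    [Figure: Normal Distribution on the X-axis in MPa (ticks at 0, 50, 100), with μ = 50 and σ = 6, so that one standard deviation either side of the mean falls at 44 and 56]

    $$z = \frac{x - \mu}{\sigma} = \frac{x - 50}{6}$$

    [Figure: Standard Normal Distribution with z = -1 and z = 1 marked; peak density $\frac{1}{\sigma \sqrt{2\pi}} = 0.066$]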

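  • Probability

    Class Example (cont.)

    Here Φ(z) is the standard normal table value, the probability of a value smaller than z.

    1. Larger than 65 MPa:

    $$z = \frac{x - \mu}{\sigma} = \frac{65 - 50}{6} = 2.5, \qquad \Phi(2.5) = 0.99379$$

    $$P[x > 65] = 1 - \Phi(2.5) = 1 - 0.99379 = 0.00621$$

    [Figure: standard normal curve on the Z-axis with the required area, to the right of z = 2.5, marked]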

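  • Probability

    Class Example - Solution

    2. Less than 38 MPa:

    $$z = \frac{38 - 50}{6} = -2, \qquad P[x < 38] = 1 - \Phi(2) = 1 - 0.97725 = 0.02275$$

    3. Between 45 and 60 MPa:

    $$P[x < 60]: \quad z = \frac{60 - 50}{6} = 1.667, \qquad \Phi(1.667) = 0.95254$$

    $$P[x < 45]: \quad z = \frac{45 - 50}{6} = -0.833, \qquad \Phi(-0.833) = 1 - \Phi(0.833) = 0.20327$$

    $$P[45 < x < 60] = P[-0.833 < z < 1.67] = 0.95254 - 0.20327 = 0.74927$$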

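  • Probability

    Class Example - Solution (cont.)

    4. Between 59 and 60 MPa, without the table. Use the density at the average of 59 and 60, x = 59.5 MPa:

    $$y(x) = \frac{1}{\sigma \sqrt{2\pi}}\, e^{-\frac{1}{2} \left(\frac{x - \mu}{\sigma}\right)^{2}}, \qquad y(59.5) = 0.018984$$

    $$P[59 < x < 60] \approx y(59.5)\,(60 - 59) = 0.018984$$

    5. Between 1 standard deviation more than the average and the 3rd quartile. The 3rd quartile is at a cumulative probability of 0.75 (z = 0.67) and Φ(1) = 0.84134:

    $$P = 0.84134 - 0.75 = 0.09134$$

    6. Strength below which only 5% of cubes fall. Φ(z) = 0.95 gives z = 1.65, but we are looking at less than, so z = -1.65:

    $$z = \frac{x - \mu}{\sigma} \quad \Rightarrow \quad x = \mu + z \sigma = 50 - 1.65(6) = 40.1 \text{ MPa}$$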

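  • Probability

    7. Borders for 80% of the strengths. With 10% in each tail, Φ(z) = 0.90 gives z = 1.28:

    $$z = \frac{x - \mu}{\sigma} \quad \Rightarrow \quad x = 50 \pm 1.28(6) = 50 \pm 7.68$$

    $$\text{Borders: } 42.32 \leq x \leq 57.68$$

    [Figure: standard normal curve on the Z-axis with a 10% tail marked]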
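    All seven cube-strength answers can be recomputed without rounding z to table values. This is a sketch using the standard library's `statistics.NormalDist`; the variable names are assumptions for this example.

```python
from statistics import NormalDist

cube = NormalDist(mu=50, sigma=6)  # cube strength in MPa

p_over_65 = 1 - cube.cdf(65)                  # 1. larger than 65 MPa
p_under_38 = cube.cdf(38)                     # 2. less than 38 MPa
p_45_to_60 = cube.cdf(60) - cube.cdf(45)      # 3. between 45 and 60 MPa
p_59_to_60 = cube.pdf(59.5) * (60 - 59)       # 4. density at midpoint times width
p_q3_to_1sd = cube.cdf(50 + 6) - 0.75         # 5. 3rd quartile up to mean + 1 SD
x_5_percent = cube.inv_cdf(0.05)              # 6. strength with 5% of cubes below it
low, high = cube.inv_cdf(0.10), cube.inv_cdf(0.90)  # 7. borders of the middle 80%

print(round(p_over_65, 5), round(p_under_38, 5), round(p_45_to_60, 5))
print(round(p_59_to_60, 6), round(p_q3_to_1sd, 5))
print(round(x_5_percent, 2), round(low, 2), round(high, 2))
```

    Small differences from the slide values (for example in answer 3) come from the table lookup, where z is rounded to two decimals before reading off the probability.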

  • Probability

    Using the cumulative frequency plot to graphically judge whether data is normally distributed

    [Figure: z corresponding to the proportion of observations less than x (-2.5 to 2.5) plotted against arm length (690 to 850 mm). A straight line indicates a normal distribution.]

  • Probability

    Expectations of discrete random variables

    Let the probability that a random variable X takes the value x be P[X = x]. The expectation of X is:

    $$E(X) = \sum x \, P[X = x]$$

    The summation takes place over all the possible values of X. That is, the expectation is in effect the mean.

  • Probability

    Expectation - Example

    x         0     1     2     3     4
    P[X=x]    0.1   0.2   0.4   0.2   0.1

    $$E(X) = 0(0.1) + 1(0.2) + 2(0.4) + 3(0.2) + 4(0.1)$$

    $$E(X) = 0 + 0.2 + 0.8 + 0.6 + 0.4$$

    $$E(X) = 2$$

  • Probability

    Expectations of continuous random variables

    $$E(x) = \int x \, f(x) \, dx$$

    Example: $f(x) = 1 - \tfrac{1}{2} x$ for x between 0 and 2; for other x it is 0. Then

    $$E(x) = \int_{0}^{2} x \left(1 - \tfrac{1}{2} x\right) dx$$

    $$E(x) = \left[\tfrac{1}{2} x^{2} - \tfrac{1}{6} x^{3}\right]_{0}^{2}$$

    $$E(x) = \tfrac{2}{3}$$

  • Probability

    Variances and standard deviations of discrete random variables

    The variance of a random variable X is the expectation of [X - E(X)]², that is:

    $$V(X) = \sum [x - E(X)]^{2} \, P[X = x]$$

    Example from the previous data, where E(X) = 2:

    $$V(X) = [0 - 2]^{2}(0.1) + [1 - 2]^{2}(0.2) + [2 - 2]^{2}(0.4) + [3 - 2]^{2}(0.2) + [4 - 2]^{2}(0.1)$$

    $$V(X) = 0.4 + 0.2 + 0 + 0.2 + 0.4$$

    $$V(X) = 1.2$$

    $$\sigma = \sqrt{V(X)} = 1.1$$
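    The expectation and variance sums translate directly into code. This is a minimal sketch for the five-value example; the names `pmf`, `expectation` and `variance` are assumptions for this illustration.

```python
from math import sqrt

# Probability of each value of X, taken from the example table.
pmf = {0: 0.1, 1: 0.2, 2: 0.4, 3: 0.2, 4: 0.1}


def expectation(pmf):
    """E(X): sum of x * P[X = x] over all possible values."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """V(X): sum of (x - E(X))^2 * P[X = x] over all possible values."""
    mean = expectation(pmf)
    return sum((x - mean) ** 2 * p for x, p in pmf.items())


mean = expectation(pmf)
var = variance(pmf)
sd = sqrt(var)

print(mean, var, round(sd, 1))
```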

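  • Probability

    Variances and standard deviations of continuous random variables

    $$V(X) = \int [x - E(X)]^{2} f(x) \, dx$$

    For our example, where $f(x) = 1 - \tfrac{1}{2} x$ and $E(X) = \tfrac{2}{3}$:

    $$V(X) = \int_{0}^{2} \left(x - \tfrac{2}{3}\right)^{2} \left(1 - \tfrac{1}{2} x\right) dx$$

    $$V(X) = \left[-\tfrac{1}{8} x^{4} + \tfrac{5}{9} x^{3} - \tfrac{7}{9} x^{2} + \tfrac{4}{9} x\right]_{0}^{2}$$

    $$V(X) = \tfrac{2}{9}$$

    $$\sigma = \sqrt{V(X)} = 0.47$$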
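    The two integrals can be checked numerically. The sketch below uses composite Simpson's rule, which is a technique chosen for this illustration rather than one used in the notes; the function names `simpson` and `f` are likewise assumptions.

```python
def simpson(g, a, b, n=100):
    """Approximate the integral of g over [a, b] with composite Simpson's rule.

    n is the number of sub-intervals and must be even.
    """
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        weight = 4 if i % 2 else 2  # odd points weigh 4, even interior points 2
        total += weight * g(a + i * h)
    return total * h / 3


def f(x):
    """Probability density of the example: 1 - x/2 on [0, 2], zero elsewhere."""
    return 1 - x / 2


area = simpson(f, 0, 2)                                  # should be 1
mean = simpson(lambda x: x * f(x), 0, 2)                 # E(X)
var = simpson(lambda x: (x - mean) ** 2 * f(x), 0, 2)    # V(X)
sd = var ** 0.5

print(round(area, 6), round(mean, 6), round(var, 6), round(sd, 2))
```

    Simpson's rule is exact for polynomials up to degree three, so for this density the numerical results match 2/3 and 2/9 up to floating-point error.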

  • Probability

    Example - IQ

    The following table gives the IQ levels of a group of students together with the probability of each IQ level.

    IQ Level   P[X=x]
    60         0.01
    70         0.02
    80         0.09
    90         0.22
    100        0.33
    110        0.22
    120        0.08
    130        0.02
    140        0.01

  • Probability

    IQ Test - Example

    IQ Level  P[X=x]  x P[X=x]  [x-E(X)]^2 P[X=x]  y(x)    Probability  Cumulative Probability  z
    60        0.01    0.6       15.8404            0.0004  0.004        0.004                   -2.65
    70        0.02    1.4       17.7608            0.0025  0.025        0.029                   -1.9
    80        0.09    7.2       35.2836            0.0100  0.100        0.129                   -1.13
    90        0.22    19.8      21.1288            0.0228  0.228        0.357                   -0.37
    100       0.33    33        0.0132             0.0297  0.298        0.654                   0.4
    110       0.22    24.2      22.8888            0.0223  0.223        0.877                   1.16
    120       0.08    9.6       32.6432            0.0096  0.096        0.973                   1.93
    130       0.02    2.6       18.2408            0.0024  0.024        0.997                   2.65
    140       0.01    1.4       16.1604            0.0003  0.003        1.000
    Total             99.8      179.96             0.0999

    E(X) = mean      99.80
    V(X)             179.96
    Std Deviation    13.41

    $$y(x) = \frac{1}{\sigma \sqrt{2\pi}}\, e^{-\frac{1}{2} \left(\frac{x - \mu}{\sigma}\right)^{2}}$$
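    The summary rows of the IQ table can be recomputed from the probabilities alone. The sketch below also evaluates the fitted normal density at each IQ level; multiplying it by the 10-point spacing between levels to obtain the "Probability" column is an assumption about how that column was built, and all variable names are chosen for this example.

```python
from math import sqrt
from statistics import NormalDist

iq_pmf = {60: 0.01, 70: 0.02, 80: 0.09, 90: 0.22, 100: 0.33,
          110: 0.22, 120: 0.08, 130: 0.02, 140: 0.01}

mean = sum(x * p for x, p in iq_pmf.items())               # E(X)
var = sum((x - mean) ** 2 * p for x, p in iq_pmf.items())  # V(X)
sd = sqrt(var)

# Normal distribution with the same mean and standard deviation.
fitted = NormalDist(mu=mean, sigma=sd)
width = 10  # spacing between IQ levels (assumed interval width)
approx = {x: fitted.pdf(x) * width for x in iq_pmf}

print(round(mean, 2), round(var, 2), round(sd, 2))
for x in iq_pmf:
    print(x, iq_pmf[x], round(approx[x], 3))
```

    Printing the tabulated probability next to the normal approximation shows how closely the fitted curve follows the IQ data.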

  • Probability

    [Figure: P[X = x] (0.00 to 0.35) plotted against IQ (40 to 160)]

  • Probability

    [Figure: P[X = x] (0.00 to 0.35) plotted against IQ (60 to 140)]

  • Probability

    IQ

    40 60 80 100 120 140 160

    P[X