4. Probability Distribution

  • 8/7/2019 4. Probability Distribution

    1/49

    1

    Random Variables and Probability Distribution

    Purnomo

    Department of Mechanical Engineering, UGM (Jurusan Teknik Mesin UGM)

    Random Variables

    A random variable X is a numerical-valued function defined on a sample space.

    A number X(e), providing a measure of the characteristic of interest, is assigned to each simple event e in the sample space.

    Example (a die): X = 1, 2, 3, 4, 5, 6

    Two balls are drawn in succession from a box that contains 4 red balls and 3 blue balls. The possible outcomes and the values y of the random variable Y, where Y is the number of red balls, are:

    Sample space    y
    RR              2
    RB              1
    BR              1
    BB              0

    Illustration

    Two products A and B are judged by four consumers, each of whom then expresses a preference for A or B. The outcome when the first and third consumers prefer A and the other consumers prefer B is denoted by ABAB. The number of outcomes is $2^4 = 16$.

    AAAA   AAAB   AABB   ABBB   BBBB
           AABA   ABAB   BABB
           ABAA   ABBA   BBAB
           BAAA   BAAB   BBBA
                  BBAA
                  BABA

    Illustration

    Suppose that the products are alike in quality and that the consumers express their preferences independently. Then the 16 simple events in the sample space are equally likely, and each has a probability of 1/16. Let a random variable X be defined as X = number of persons preferring A to B.

    Probability distribution:

    Distinct value of X    0      1      2      3      4
    Probability            1/16   4/16   6/16   4/16   1/16

    $P[X \ge 2] = 6/16 + 4/16 + 1/16 = 11/16$

    $P[1 \le X \le 3] = 4/16 + 6/16 + 4/16 = 14/16$
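The distribution above can be reproduced by listing all 16 equally likely outcomes and counting the A's in each. The following is a minimal sketch under that equal-likelihood assumption; the function name `preference_distribution` is hypothetical and chosen only for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def preference_distribution(n_consumers=4):
    """Distribution of X = number of consumers preferring A,
    assuming every outcome string (e.g. ABAB) is equally likely."""
    outcomes = list(product("AB", repeat=n_consumers))
    counts = Counter(outcome.count("A") for outcome in outcomes)
    total = len(outcomes)
    return {x: Fraction(counts[x], total) for x in sorted(counts)}


dist = preference_distribution()
p_at_least_2 = sum(p for x, p in dist.items() if x >= 2)
p_1_to_3 = sum(p for x, p in dist.items() if 1 <= x <= 3)

# Fractions print in lowest terms, so 4/16 appears as 1/4.
print(dist)
print(p_at_least_2, p_1_to_3)
```

Exact fractions are used instead of floats so that the results can be compared directly with the sixteenths in the table.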

    Probability Distribution

    The probability distribution, or simply the distribution, of a discrete random variable is a list of the distinct values $x_i$ together with their associated probabilities $f(x_i) = P[X = x_i]$.

    Graphic Presentation: Line diagram

    [Figure: line diagram of f(x) against x for x = 0, 1, 2, 3, 4; vertical axis marked at 2/16, 4/16 and 6/16]

    Probability Histogram

    [Figure: probability histogram of f(x) for x = 1, 2, 3, 4; vertical axis marked at 2/8 and 4/8; tabulated values of f(x) include 1/8; one bar annotated "Area = 0.5"]

    Properties of relative frequency histogram

    - The total area under the histogram is 1.

    - For two points a and b such that each is a boundary point of some class, the relative frequency of measurements in the interval a to b is the area under the histogram enclosed by this interval.

    Expectation

    Expected value or expectation of X:

    $E(X) = \sum_i x_i f(x_i)$

    x         0     1     2     3     4     5     Total
    f(x)      0.1   0.1   0.2   0.3   0.2   0.1   1
    x f(x)    0     0.1   0.4   0.9   0.8   0.5   2.7

    E(X) = 2.7

    E(X) = population mean = $\mu$
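The tabulated calculation can be checked with a few lines of code. This is a minimal sketch that uses the six values and probabilities from the table; the function name `expectation` is an assumption made for illustration.

```python
import math


def expectation(values, probs):
    """E(X) = sum of x_i * f(x_i) for a discrete distribution."""
    if not math.isclose(sum(probs), 1.0):
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in zip(values, probs))


values = [0, 1, 2, 3, 4, 5]
probs = [0.1, 0.1, 0.2, 0.3, 0.2, 0.1]
mean = expectation(values, probs)

# Rounded for display because the inputs are binary floats.
print(round(mean, 10))
```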

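    Variance: a measure of spread

    Deviation = $X - \mu$

    Deviations: $(x_1 - \mu), (x_2 - \mu), \ldots, (x_k - \mu)$

    Probabilities: $f(x_1), f(x_2), \ldots, f(x_k)$

    $E(\text{deviation}) = E(X - \mu) = \sum_i (x_i - \mu) f(x_i) = 0$

    Because its expected value is always 0, the deviation itself cannot be used as a measure of spread.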

    Variance and Standard Deviation

    Variance of X ($= \sigma^2 = \sigma_X^2$):

    $\mathrm{Var}(X) = E[(X - \mu)^2] = E(X^2) - \mu^2$

    Standard deviation ($= \sigma = \sigma_X$):

    $\mathrm{sd}(X) = \sqrt{\mathrm{Var}(X)}$
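Both forms of the variance formula should give the same number. The sketch below checks this on the distribution tabulated on the Expectation slide (values 0 to 5); the function name `mean_var_sd` is hypothetical.

```python
import math


def mean_var_sd(values, probs):
    """Return (mean, variance by definition, variance by shortcut, sd)."""
    mu = sum(x * p for x, p in zip(values, probs))
    # Definition: E[(X - mu)^2]
    var_definition = sum((x - mu) ** 2 * p for x, p in zip(values, probs))
    # Shortcut: E(X^2) - mu^2
    var_shortcut = sum(x * x * p for x, p in zip(values, probs)) - mu ** 2
    return mu, var_definition, var_shortcut, math.sqrt(var_definition)


values = [0, 1, 2, 3, 4, 5]
probs = [0.1, 0.1, 0.2, 0.3, 0.2, 0.1]
mu, var_definition, var_shortcut, sd = mean_var_sd(values, probs)
print(round(mu, 4), round(var_definition, 4), round(var_shortcut, 4), round(sd, 4))
```

For this distribution the variance works out to 2.01 by either route.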

    Standardized random variable

    A standardized random variable Z has a mean of 0 and a variance of 1:

    $Z = \dfrac{X - \mu}{\sigma}$ has $E(Z) = 0$ and $\mathrm{Var}(Z) = 1$

    This form will be used extensively in applications.
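As a quick numerical check of the claim that standardizing gives mean 0 and variance 1, the sketch below standardizes the discrete distribution from the Expectation slide. The variable names are assumptions made for illustration.

```python
import math

values = [0, 1, 2, 3, 4, 5]
probs = [0.1, 0.1, 0.2, 0.3, 0.2, 0.1]

# Mean and standard deviation of X.
mu = sum(x * p for x, p in zip(values, probs))
sigma = math.sqrt(sum((x - mu) ** 2 * p for x, p in zip(values, probs)))

# Standardized values keep the same probabilities.
z_values = [(x - mu) / sigma for x in values]
mean_z = sum(z * p for z, p in zip(z_values, probs))
var_z = sum((z - mean_z) ** 2 * p for z, p in zip(z_values, probs))

print(round(mean_z, 10), round(var_z, 10))
```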

    PROBABILITY MODELS FOR CONTINUOUS RANDOM VARIABLES

    The probability distribution of a continuous random variable can be visualized as a smooth form of the relative frequency histogram based on a large number of observations.

    PROBABILITY DENSITY CURVE

    A probability density curve can be viewed as a limiting form of the relative frequency histogram (as the number of classes tends to infinity).

    [Annotation P2 on slide 15: PURNOMO, 8/23/2006]

    Properties of Probability Density Function, f(x)

    - The total area under the density curve is 1.

    - $P[a \le X \le b]$ = area under the density curve between a and b.

    - f(x) is positive or zero.

    - For a continuous random variable, the probability that X = x is always 0 (probability is only meaningful when X lies in an interval).
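These properties can be illustrated numerically. The sketch below assumes a hypothetical density f(x) = 2x on [0, 1] (not taken from the slides) and approximates areas with the midpoint rule; the names `density` and `area` are illustrative only.

```python
def density(x):
    """Hypothetical density: f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def area(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the area under f between a and b."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


total_area = area(density, 0.0, 1.0)      # total area under the curve
p_interval = area(density, 0.2, 0.5)      # P[0.2 <= X <= 0.5]
p_point = area(density, 0.3, 0.3)         # P[X = 0.3], an interval of width 0

print(round(total_area, 6), round(p_interval, 6), p_point)
```

The interval probability equals 0.5^2 - 0.2^2 = 0.21 for this density, while the single point contributes no area.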

    Density Curves

    Measuring center and spread for density curves

    - Density curves describe the overall shape of a distribution

    - Ideal patterns that are accurate enough for practical purposes

    - Faster to draw and easier to use

    - Areas or proportions under the curve represent counts or percents of observations

    Features of a Continuous Distribution

    As with relative frequency histograms, the probability density curves of continuous random variables possess a wide variety of shapes:

    - Negatively skewed

    - Symmetric

    - Positively skewed

    - Flat

    - Bell shaped

    - Peaked

    Center of a Density Curve

    - The mode of a distribution is the point where the curve is highest.

    - The median is the point where half of the area under the curve lies on the left and the other half on the right (the equal-areas point).

    - Quartiles can be found by dividing the area under the curve into four equal parts: 1/4 of the area is to the left of the 1st quartile, and 3/4 of the area is to the left of the 3rd quartile.

    - The mean is the balance point.

    Percentiles

    Percentiles are defined as follows:

    The population 100p-th percentile is an x value that has an area p to the left and 1 - p to the right.

    - Lower (first) quartile = 25th percentile

    - Second quartile (or median) = 50th percentile

    - Upper (third) quartile = 75th percentile
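The definition of a percentile translates directly into a search for the x value with area p to its left. The sketch below assumes a hypothetical density f(x) = 2x on [0, 1], whose left-hand area is x^2, and finds percentiles by bisection; the names `cdf` and `percentile` are illustrative.

```python
def cdf(x):
    """Area to the left of x under the hypothetical density f(x) = 2x on [0, 1]."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return x * x


def percentile(p, lo=0.0, hi=1.0, tol=1e-12):
    """Bisection search for the x value that has area p to its left."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


q1, median, q3 = (percentile(p) for p in (0.25, 0.50, 0.75))
print(round(q1, 4), round(median, 4), round(q3, 4))
```

For this density the quartiles are the square roots of 0.25, 0.50 and 0.75, which gives a simple closed-form check on the search.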

    The Normal distribution

    - Discovered by Abraham de Moivre around 1720. Around 1870, Adolphe Quetelet realised that the normal curve could be used to compare histograms of data.

    - Chest measurements of 5738 Scottish soldiers by the Belgian scholar Lambert Adolphe Quetelet (1796-1874)

    - Pierre Laplace and Carl Gauss: bell-shaped distribution

    - Gauss derived the normal distribution mathematically as the probability distribution of the error of measurements, which is called the normal law of error (Gaussian distribution)

    Normal Distributions

    - Symmetric

    - Single-peaked (unimodal)

    - Bell-shaped

    - The mean, median, and mode are the same

    - The points where the curvature changes are one standard deviation on either side of the mean

    - The mean and standard deviation completely specify the curve

    Normal Distribution

    The height of a normal density curve at any point x is given by

    $f(x) = \dfrac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}$

    $\mu$ is the mean

    $\sigma$ is the standard deviation

    Notation: $N(\mu, \sigma)$
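The density formula can be written out directly and compared with Python's standard-library `statistics.NormalDist`. This is a minimal sketch; the function name `normal_density` and the sample values of x, mu and sigma are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def normal_density(x, mu, sigma):
    """Height of the N(mu, sigma) density curve at x."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# Illustrative values only: x = 1.0 on a curve with mu = 2.0, sigma = 0.5.
by_formula = normal_density(1.0, 2.0, 0.5)
by_library = NormalDist(mu=2.0, sigma=0.5).pdf(1.0)
peak_standard = normal_density(0.0, 0.0, 1.0)

print(by_formula, by_library, peak_standard)
```

The peak of the standard normal curve, 1/sqrt(2*pi), is about 0.3989.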

    The Empirical Rule

    - 68% of the observations fall within one standard deviation of the mean

    - 95% of the observations fall within two standard deviations of the mean

    - 99.7% of the observations fall within three standard deviations of the mean
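The three percentages are rounded areas under the normal curve. The sketch below recomputes them with the standard-library `statistics.NormalDist`; the helper name `within_k_sd` is hypothetical.

```python
from statistics import NormalDist


def within_k_sd(k):
    """Area under the standard normal curve between -k and +k."""
    standard = NormalDist()  # mean 0, standard deviation 1
    return standard.cdf(k) - standard.cdf(-k)


areas = {k: within_k_sd(k) for k in (1, 2, 3)}
for k, a in areas.items():
    print(k, round(a, 4))
```

Because the areas depend only on the number of standard deviations, the same figures apply to any normal distribution, whatever its mean and standard deviation.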

    Example: Young Women's Height

    The heights of young women are approximately normal with mean = 64.5 inches and std. dev. = 2.5 inches.

    The normal distribution is the most important distribution in Statistics. Typical normal curves with different sigma (standard deviation) values are shown below.

    Examples with approximate Normal distributions

    - Height

    - Weight

    - IQ scores

    - Standardized test scores

    - Body temperature

    - Repeated measurement of same quantity

    FACTS

    - The universality of the normal distribution is only a myth, and examples of quite non-normal distributions abound in virtually every field of study

    - Still, the normal distribution plays a central role in statistics (it makes things easier)

    Standardizing and z-Scores

    - One case, one curve: too complicated

    - Solution: standardization (also called normalization or non-dimensionalization)

    - Result: z-scores

    - All cases, one curve (or table)

    Standardizing and z-Scores

    If an observation x comes from a distribution with mean $\mu$ and standard deviation $\sigma$, the standardized value of x is defined as

    $z = \dfrac{x - \mu}{\sigma}$,

    which is also called a z-score.

    A z-score indicates how many standard deviations the original observation is away from the mean, and in which direction.
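A z-score is a one-line calculation. The sketch below applies it to the young women's height distribution from the earlier example (mean 64.5 inches, standard deviation 2.5 inches); the observations 68 and 62 inches and the function name `z_score` are assumptions chosen for illustration.

```python
import math


def z_score(x, mu, sigma):
    """Number of standard deviations x lies from the mean (sign gives direction)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


z_tall = z_score(68.0, 64.5, 2.5)    # above the mean, so positive
z_short = z_score(62.0, 64.5, 2.5)   # below the mean, so negative
print(z_tall, z_short)
```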

    The Standard Normal Curve

    N(0,1)

    The Standard Normal Table

    The Normal Table is a table of areas under the standard normal density curve. The table entry for each value z is the area under the curve to the left of z.

    The Standard Normal Table

    The Normal Table can be used to find the proportion of observations of a variable which fall to the left of a specific value z if the variable follows a normal distribution.

    Use of The Normal Table

    $P[Z \le z]$ = area under the curve to the left of z

    $P[a \le Z \le b]$ = (area to the left of b) - (area to the left of a)
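The subtraction rule can be sketched with the standard-library `statistics.NormalDist`, whose `cdf` method plays the role of the table lookup (area to the left of z). The function name `interval_probability` and the limits a = -1, b = 2 are assumptions chosen for illustration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def interval_probability(a, b):
    """P[a <= Z <= b] = (area to the left of b) - (area to the left of a)."""
    if a > b:
        raise ValueError("a must not exceed b")
    return STANDARD_NORMAL.cdf(b) - STANDARD_NORMAL.cdf(a)


p = interval_probability(-1.0, 2.0)
print(round(p, 4))
```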

    Use of The Normal Table

    If z > 0, the symmetry of the standard normal curve about 0 gives

    $P[Z \le -z] = 1 - P[Z \le z] \le 0.5$

    $P[Z \ge -z] = P[Z \le z] \ge 0.5$
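Symmetry is what lets a table that lists only non-negative z values cover negative ones as well. The sketch below assumes such a table and stands in for it with `statistics.NormalDist().cdf`; the function name `left_area` is hypothetical.

```python
from statistics import NormalDist


def left_area(z, lookup=NormalDist().cdf):
    """Area to the left of z, calling `lookup` only at non-negative values,
    as one would with a table that lists z >= 0."""
    if z >= 0:
        return lookup(z)
    # Symmetry about 0: P[Z <= -z] = 1 - P[Z <= z]
    return 1.0 - lookup(-z)


below = left_area(-1.5)
above = left_area(1.5)
print(round(below, 4), round(above, 4))
```

The two areas add up to 1, with the negative z on the small side of 0.5 and the positive z on the large side.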

    Use of The Normal Table

    - Calculate z

    - Find the area to the left of z in the Standard Normal Probability Table

    - Other calculations follow from the properties of the standard normal curve
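The steps above can be sketched end to end on the young women's height example (mean 64.5 inches, standard deviation 2.5 inches). The question asked here, the proportion of heights at or below 68 inches and between 62 and 67 inches, is an assumption made for illustration, and `statistics.NormalDist().cdf` stands in for the printed table.

```python
from statistics import NormalDist

MU, SIGMA = 64.5, 2.5          # from the height example
table = NormalDist().cdf       # stand-in for the standard normal table


def proportion_below(x):
    """Step 1: calculate z. Step 2: look up the area to the left of z."""
    z = (x - MU) / SIGMA
    return table(z)


p_below_68 = proportion_below(68.0)                             # z = 1.4
p_62_to_67 = proportion_below(67.0) - proportion_below(62.0)    # z from -1 to 1

print(round(p_below_68, 4), round(p_62_to_67, 4))
```

The second result is the one-standard-deviation case of the empirical rule, since 62 and 67 inches lie exactly one standard deviation either side of the mean.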

    Example : random variable

    Example : Expectation
