(6) Random Variables and PMF

  • 7/30/2019 (6) Random Variables and PMF

    Applied Statistics and Computing Lab

    RANDOM VARIABLES AND PROBABILITY MASS FUNCTION

    Indian School of Business

    Learning Goals

    To understand the concept of a random variable
    Types of random variables
    Concept of PMF and CDF
    Expectation and variance of a random variable
    Properties of the expectation of a random variable

    Random Experiments and Random Variable: Concept

    Random experiments:

    1. A lot of 50 manufactured items contains 10 defective items. A random sample of 4 items is drawn from the lot, all at one time, so the order of the items does not matter.

    2. One observes how many times the BSE Sensex goes up, goes down, or remains the same in a week.

    Questions of interest:

    Number of defective items in the random sample of 4 items drawn

    Number of times Sensex has gone up in the week

    Sample space of random experiment 1: {NNNN, DNNN, DDNN, DDDN, DDDD}

    where N = non-defective item drawn, D = defective item drawn

    Note: Since the items are drawn together at one time, the order does not matter. For example, DNNN, NDNN, NNDN and NNND all represent the same outcome, written here as DNNN.

    Sample space of random experiment 2 (32 outcomes, grouped here by the number of U's):

    {UUUUU,
     UUUUD, UUUDU, UUDUU, UDUUU, DUUUU,
     UUUDD, UUDDU, UDDUU, DDUUU, DUDUU, DUUDU, DUUUD, UDUDU, UDUUD, UUDUD,
     DDDUU, DDUUD, DUUDD, UUDDD, UDUDD, UDDUD, UDDDU, DUDUD, DUDDU, DDUDU,
     DDDDU, DDDUD, DDUDD, DUDDD, UDDDD,
     DDDDD}

    where U = event that the stock price goes up, D = event that the stock price remains the same or goes down

    (Note: Stock prices are reported for 5 days in a week)

    Random Variable: Definition

    Let X represent a quantitative variable that is measured or observed in an experiment.

    For random experiment 1, X = number of defective items in the random sample of 4 items drawn. X can take the values 0, 1, 2, 3, 4.

    For random experiment 2, X = number of times the Sensex has gone up in the week. X can take the values 0, 1, 2, 3, 4, 5.

    For random experiment 1, we list the values of X and the corresponding events:

    Value of X    Corresponding event
    X=0           NNNN
    X=1           DNNN
    X=2           DDNN
    X=3           DDDN
    X=4           DDDD

    When an experiment is conducted, we know beforehand the range of values that X can assume, but the actual outcome that will materialize is unknown.

    We see that every point in the sample space corresponds to a unique value of X.

    A value of X, though, can correspond to a group of sample points. (In this example there is a one-to-one correspondence between the values of X and the corresponding events.) Check that in the Sensex example more than one sample point corresponds to a particular value of the random variable!

    Thus, we see that X defines a function from the sample space to the real line.

    X is called a random variable.

    Thus, a random variable is defined as a function from the sample space to the real line.

    Key points to note:

    Function of sample space

    Correspondence between points in the sample space and the values of a random variable

    A random variable thus partitions the sample space into mutually exclusive and exhaustive events.
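The correspondence between sample points and values of X in the Sensex example can be checked by enumeration. The following is a minimal sketch, assuming each day is recorded as U or D exactly as in the sample space listed for experiment 2; the names `sample_space`, `X` and `events` are illustrative and do not come from the slides.

```python
from collections import defaultdict
from itertools import product

# Sample space of experiment 2: one letter per trading day, 5 days a week.
# U = price goes up, D = price stays the same or goes down.
sample_space = ["".join(days) for days in product("UD", repeat=5)]


def X(outcome: str) -> int:
    """Random variable: number of days on which the Sensex went up."""
    return outcome.count("U")


# Group the sample points by the value of X.
# Every outcome lands in exactly one group, so the groups partition the sample space.
events = defaultdict(list)
for outcome in sample_space:
    events[X(outcome)].append(outcome)

for value in sorted(events):
    print(value, len(events[value]), events[value])
```

Each value of X from 0 to 5 collects several sample points (except 0 and 5), and the group sizes add up to the full sample space.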

    Probability Distribution of a Random Variable

    Corresponding to each value of the random variable, we have a set of sample points and hence a particular probability of occurrence of that value of the random variable.

    Thus, we have a natural probability assignment to the values taken by a random variable.

    A statement of all possible values of a random variable together with the corresponding probabilities gives the probability distribution of the random variable.

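    Probability of each point in the sample space occurring: $P(X = x) = \binom{10}{x}\binom{40}{4-x} \big/ \binom{50}{4}$ for $x = 0, 1, 2, 3, 4$ (nCr in the table denotes $\binom{n}{r}$)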

    Value of random variable    Corresponding event    Probability
    X=0                         NNNN                   10C0 * 40C4 / 50C4 = 91390/230300 = 0.39683
    X=1                         DNNN                   10C1 * 40C3 / 50C4 = 98800/230300 = 0.429006
    X=2                         DDNN                   10C2 * 40C2 / 50C4 = 35100/230300 = 0.15241
    X=3                         DDDN                   10C3 * 40C1 / 50C4 = 4800/230300 = 0.020842
    X=4                         DDDD                   10C4 * 40C0 / 50C4 = 210/230300 = 0.000912

    So we have the probability distribution for a discrete random variable

    The probability distribution of a discrete random variable X (for the definition, check slides 6 and 7) must satisfy the following two conditions:

    a) $p(x) \geq 0$ for all x

    b) $\sum_{\text{all } x} p(x) = 1$

    The probability distribution of a discrete random variable is called the probability mass function (pmf).

    To check whether a function f(x) = P(X = x) is a pmf, check that conditions a and b are satisfied.

    In this example, they are satisfied! (Check)

    Henceforth, X is used to represent the random variable in question and x the particular value it takes.
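The check of the two pmf conditions for the defective-items example can be sketched in a few lines. This assumes the lot of 50 items with 10 defectives and a sample of 4 described earlier; the constant and function names are illustrative, and exact fractions are used so that the sum can be compared with 1 without rounding error.

```python
from fractions import Fraction
from math import comb

LOT_SIZE, DEFECTIVE, SAMPLE_SIZE = 50, 10, 4


def pmf(x: int) -> Fraction:
    """P(X = x): probability of exactly x defective items in the sample."""
    good = LOT_SIZE - DEFECTIVE
    return Fraction(comb(DEFECTIVE, x) * comb(good, SAMPLE_SIZE - x),
                    comb(LOT_SIZE, SAMPLE_SIZE))


probabilities = {x: pmf(x) for x in range(SAMPLE_SIZE + 1)}

# Condition (a): every probability is non-negative.
condition_a = all(p >= 0 for p in probabilities.values())
# Condition (b): the probabilities sum to exactly 1.
condition_b = sum(probabilities.values()) == 1

for x, p in probabilities.items():
    print(x, p, round(float(p), 6))
print(condition_a, condition_b)
```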

    Types of Random Variable

    Discrete random variable: When a random variable can take only a finite number of values, or a countably infinite number of values, it is a discrete random variable.

    Concept of countably infinite: Let the random variable X be the number of throws of a die until the first six appears. Then X can take any of the values 1, 2, 3, 4, ... In theory, there are infinitely many possible values of X. The set of values of X corresponds to the set of natural (counting) numbers. Therefore, this type of infinity is called countable.

    Continuous random variable: When a random variable can take any of the uncountably many values in an interval of the real line, it is a continuous random variable.

    Examples: discrete or continuous?

    Weight of a boy measured in kg: continuous, because weight can take any real value
    The number of bad checks drawn at Bank A on a day selected at random: discrete, because the number of checks can only be a whole number
    Decay time for a radioactive particle: continuous, because time can take any value
    Number of wells an oil prospector drills until the first productive well is found: discrete, because this corresponds to counting the set of natural numbers
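The die example shows that a discrete random variable can still have infinitely many possible values. As a rough sketch, assuming a fair six-sided die (the slides do not state this) and independent throws, the probability that the first six appears on throw k is (5/6)^(k-1) * (1/6), and the running total of these probabilities approaches 1 without any finite list of values exhausting it. The names `P_SIX`, `pmf` and `partial_sum` are illustrative.

```python
from fractions import Fraction

P_SIX = Fraction(1, 6)  # assumption: a fair six-sided die


def pmf(k: int) -> Fraction:
    """P(X = k): k - 1 throws without a six, followed by a six on throw k."""
    return (1 - P_SIX) ** (k - 1) * P_SIX


def partial_sum(n: int) -> Fraction:
    """P(X <= n): total probability carried by the first n values of X."""
    return sum(pmf(k) for k in range(1, n + 1))


for n in (1, 10, 50):
    print(n, float(partial_sum(n)))
```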

    Discrete or Continuous?

    Can you count the number of cubes and the number of cars in the left panel?

    Can you measure the weight, the time and the scale in the right panel?

    For all practical purposes, the values of a continuous random variable can be measured (at least in theory) to any degree of accuracy, while the values taken by a discrete random variable can be counted.

    Visuals from Aczel and Sounderpandian, Complete Business Statistics

    Cumulative Distribution Function of a Discrete Random Variable

    The probability distribution of a discrete random variable lists the probabilities of occurrence of the different values of the random variable. We may also be interested in the cumulative probabilities of the random variable, that is, the probability that the value of the random variable is at most some value x. This is the sum of the probabilities of all values i of X that are less than or equal to x.

    We ask the question: what is the probability that at most 0, 1, 2, 3 or 4 items are defective in the sample of 4 items?

    For a discrete random variable, the cumulative probabilities are obtained by adding the individual probabilities.

    Value of X    Probability      Cumulative probability
    0             91390/230300     91390/230300
    1             98800/230300     190190/230300
    2             35100/230300     225290/230300
    3             4800/230300      230090/230300
    4             210/230300       230300/230300
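The cumulative column is a running total of the probability column. A minimal sketch of that step, using the numerators from the table and Python's `itertools.accumulate`; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import accumulate

# Numerators of P(X = x) for x = 0..4, over the common denominator 230300.
numerators = [91390, 98800, 35100, 4800, 210]
DENOMINATOR = 230300

pmf = [Fraction(n, DENOMINATOR) for n in numerators]
cdf = list(accumulate(pmf))  # running totals: F(x) = P(X <= x)

for x, (p, F) in enumerate(zip(pmf, cdf)):
    print(x, float(p), float(F))
```

The last entry of `cdf` is exactly 1, and each entry is at least as large as the one before it.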

    Definition: Cumulative Distribution Function

    The cumulative distribution function, F(x), of a discrete random variable X is

    $F(x) = P(X \leq x) = \sum_{\text{all } i \leq x} P(i)$

    All cumulative distribution functions are nondecreasing and equal 1.00 at the largest possible value of the random variable. The probability that the value of the random variable is less than or equal to the largest possible value is 1 by definition!

    Expectation and Variance of a Random Variable

    The mean of the probability distribution of a random variable is called the expected value of the random variable.

    The reason for this name is that the mean is the (probability-weighted) average value of the random variable, and therefore it is the value we expect on average.

    $E(X) = \sum_{\text{all } x} x \, p(x)$, for a discrete random variable

    The variance of a random variable is the expected squared deviation of the random variable from its mean (expectation). The idea is similar to that of the variance of a data set. The probabilities of the values of the random variable are used as weights in the computation of the squared deviation from the mean of a discrete random variable. The definition of the variance follows.

    The variance of a discrete random variable X is given by

    $\mathrm{Var}(X) = E\big((X - E(X))^2\big) = \sum_{\text{all } x} (x - E(X))^2 \, p(x)$

    Computational formula for the variance of a random variable:

    $\mathrm{Var}(X) = E(X^2) - (E(X))^2$

    Check that the two expressions for variance are equal.
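One way to carry out this check is to expand the square in the definition. The derivation below is a sketch for the discrete case; it writes $\mu$ for $E(X)$, a shorthand not used on the slides, and relies only on $\sum p(x) = 1$ and $\sum x \, p(x) = \mu$.

```latex
\begin{align*}
\mathrm{Var}(X) &= \sum_{\text{all } x} (x - \mu)^2 \, p(x) \\
 &= \sum_{\text{all } x} x^2 \, p(x) \;-\; 2\mu \sum_{\text{all } x} x \, p(x) \;+\; \mu^2 \sum_{\text{all } x} p(x) \\
 &= E(X^2) - 2\mu \cdot \mu + \mu^2 \cdot 1 \\
 &= E(X^2) - \big(E(X)\big)^2
\end{align*}
```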

    Illustration: Computation of Mean

    We rewrite table 1:

    Value of X    Probability
    0             0.39683
    1             0.429006
    2             0.15241
    3             0.020842
    4             0.000912

    $E(X) = \sum_{\text{all } x} x \, p(x) = 0(0.39683) + 1(0.429006) + 2(0.15241) + 3(0.020842) + 4(0.000912) = 0.8$

    Therefore, 0.8 is the expected number of defective items in a sample of 4.
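The same sum can be computed without rounding by keeping the probabilities as fractions. This is a small sketch using the counts from the earlier probability table; the helper name `expectation` is illustrative.

```python
from fractions import Fraction

# P(X = x) for x = 0..4, as exact fractions (counts over 50C4 = 230300).
pmf = {0: Fraction(91390, 230300), 1: Fraction(98800, 230300),
       2: Fraction(35100, 230300), 3: Fraction(4800, 230300),
       4: Fraction(210, 230300)}


def expectation(pmf: dict) -> Fraction:
    """E(X) = sum of x * p(x) over all values x."""
    return sum(x * p for x, p in pmf.items())


mean = expectation(pmf)
print(mean, float(mean))
```

The exact result is 4/5, which agrees with the rounded decimal computation.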

    Illustration: Computation of Variance

    Value of X    Probability    $X^2$
    0             0.39683        0
    1             0.429006       1
    2             0.15241        4
    3             0.020842       9
    4             0.000912       16

    Use the computational formula for the variance (easier to obtain):

    $\mathrm{Var}(X) = E(X^2) - (E(X))^2$

    $E(X^2) = \sum_{\text{all } x} x^2 \, p(x) = 0(0.39683) + 1(0.429006) + 4(0.15241) + 9(0.020842) + 16(0.000912) = 1.240816$

    $\mathrm{Var}(X) = 1.240816 - (0.8)^2 = 0.600816$

    Standard deviation of X = sqrt(0.600816) = 0.775123
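As a cross-check on the arithmetic, the variance can be computed both from the definition and from the computational formula. The sketch below uses exact fractions built from the counts in the probability table, so the two results can be compared for equality; the variable names are illustrative.

```python
from fractions import Fraction
from math import sqrt

# P(X = x) for x = 0..4, as exact fractions (counts over 50C4 = 230300).
pmf = {0: Fraction(91390, 230300), 1: Fraction(98800, 230300),
       2: Fraction(35100, 230300), 3: Fraction(4800, 230300),
       4: Fraction(210, 230300)}

mean = sum(x * p for x, p in pmf.items())

# Definition: expected squared deviation from the mean.
var_definition = sum((x - mean) ** 2 * p for x, p in pmf.items())

# Computational formula: E(X^2) - (E(X))^2.
second_moment = sum(x ** 2 * p for x, p in pmf.items())
var_computational = second_moment - mean ** 2

print(float(second_moment), float(var_computational), sqrt(var_computational))
```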

    An example: Properties of RV

    Consider a bookseller. At the beginning of the month he buys each book for 100 Rs and sells it for 120 Rs. He also incurs a fixed monthly cost of 100 Rs towards maintenance of his store, regardless of the number of books sold. The number of books sold is a random variable, but he has to decide at the beginning of the month how many books to buy from his supplier. He is interested in the following questions:

    How many books to order?
    What is his expected profit per month?
    What is the standard deviation of his profit?

    As you can guess, he will order the expected number of books that he can sell. He has the following data on the purchase pattern at his shop:

    Number of books sold (X)    Probability
    5                           1/5
    10                          1/10
    15                          2/5
    20                          1/10
    25                          1/5

    The expected number of books sold is E(X) = 5(1/5) + 10(1/10) + 15(2/5) + 20(1/10) + 25(1/5) = 15. He places an order for 15 books.

    His expected profit = E(120X - 100X - 100) = E(20X - 100). How do we calculate this? It would be very easy if E(20X - 100) could be written as 20E(X) - 100.

    We investigate some properties of functions of random variables.

    Some Properties of the Expectation of a Random Variable

    The expected value of a function of a random variable can be computed as follows.

    Let h(X) be a function of the discrete random variable X. The expected value of h(X) is

    $E(h(X)) = \sum_{\text{all } x} h(x) \, p(x)$

    The function h(X) could be X, 2X, 3X, $X^2$, log(X) or any other function. (In computing the variance we use this property with h(X) = $X^2$.)

    The expected value of a linear function of a random variable is E(aX + b) = aE(X) + b, where a and b are fixed numbers.

    A useful property of variance: the variance of a linear function of a random variable is V(aX + b) = $a^2$ V(X), where a and b are fixed numbers.

    As you can see, you can use the last two properties to compute the bookseller's expected profit and the variation in his profit:

    Expected profit = 20E(X) - 100 = 20*15 - 100 = 200

    V(X) = E($X^2$) - (E(X))$^2$ = 270 - 225 = 45

    Variance of profit = (20)$^2$ * V(X) = 400*45 = 18,000

    Standard deviation of profit = sqrt(18,000) = 134.16
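The two linear-function properties can be verified on the bookseller's data by computing E(h(X)) and the variance of h(X) directly from the definition and comparing them with the shortcut formulas. This is a sketch under the slide's setup with h(X) = 20X - 100; the helper `expect` and the other names are illustrative.

```python
from fractions import Fraction
from math import sqrt

# Number of books sold (X) and its probability, from the bookseller's table.
pmf = {5: Fraction(1, 5), 10: Fraction(1, 10), 15: Fraction(2, 5),
       20: Fraction(1, 10), 25: Fraction(1, 5)}


def expect(h, pmf):
    """E(h(X)) = sum of h(x) * p(x) over all values x."""
    return sum(h(x) * p for x, p in pmf.items())


a, b = 20, -100  # linear function h(X) = 20X - 100


def h(x):
    return a * x + b


mean_x = expect(lambda x: x, pmf)
var_x = expect(lambda x: x ** 2, pmf) - mean_x ** 2

# Direct computation from the definition of E(h(X)).
mean_h = expect(h, pmf)
var_h = expect(lambda x: h(x) ** 2, pmf) - mean_h ** 2

# Shortcut via the linear-function properties.
assert mean_h == a * mean_x + b
assert var_h == a ** 2 * var_x

print(mean_x, var_x, mean_h, var_h, round(sqrt(var_h), 2))
```

The constant b shifts the mean but drops out of the variance, which is why only a appears (squared) in V(aX + b).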

    Thank you
