53
STAT509: Discrete Random Variable Peijie Hou University of South Carolina September 16, 2014 Peijie Hou STAT509: Discrete Random Variable

STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

STAT509: Discrete Random Variable

Peijie Hou

University of South Carolina

September 16, 2014

Peijie Hou STAT509: Discrete Random Variable

Page 2: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Motivation

I So far, we have already known how to calculateprobabilities of events.

I Suppose we toss a fair coin three times, we know that theprobability of getting three heads in a row is 1/8.However, people maybe not interested in the probabilitydirectly related to the sample space, instead, somenumerical summaries are of our interest.

I Suppose we are interested in the number of heads in aexperiment of tossing a fair coin three times. The possiblevalues are 0, 1, 2, and 3. The corresponding probabilitiescan be calculated.

Peijie Hou STAT509: Discrete Random Variable

Page 3: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Random Variable

DefinitionLet S be the sample space of a random experiment. Arandom variable defined on S is a function from S into thereals. We usually denote random variables by the capitalletters X ,Y , etc. In mathematical notation,

X : S → R.

The set of possible distinct values of the random variable iscalled its range. A realized value of the random variable isusually denoted by the lower-case letters x , y ,etc.

Peijie Hou STAT509: Discrete Random Variable

Page 4: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Type of Random Variables

I A discrete random variable can take one of a countablelist of distinct values. Usually, counts are discrete.

I A continuous random variable can take any value in aninterval of the real number line. Usually, measurementsare continuous.

I Classify the following random variables as discrete orcontinuous

I Time until a projectile returns to earth.I The number of times a transistor in computer memory

changes state in one operation.I The volume of gasoline that is lost to evaporation during

the filling of a gas tank.I The outside diameter of a machined shaft.

Peijie Hou STAT509: Discrete Random Variable

Page 5: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Example of Discrete Random Variable

I Consider toss a fair coin 10 times. The sample space Scontains total 210 = 1024 elements, which is of the form

S = {TTTTTTTTTT , . . . ,HHHHHHHHHH}

I Define the random variable Y as the number of tails outof 10 trials. Remeber that a random variable is a mapfrom sample space to real number. For instance,

Y (TTTTTTTTTT ) = 10.

I The range (all possible values) of Y is {0, 1, 2, 3, . . . , 10},which is much smaller than S .

Peijie Hou STAT509: Discrete Random Variable

Page 6: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Example: Mechanical Components

I An assembly consists of three mechanical components.Suppose that the probabilities that the first, second, andthird components meet specifications are 0.90, 0.95 and0.99 respectively. Assume the components areindependent.

I Define eventSi = the i th component is within specification, wherei = 1, 2, 3.

I One can calculateP(S1S2S3) = (0.9)(0.95)(0.99) = 0.84645,P(S1S2S3) = (0.9)(0.05)(0.01) = 0.00045, etc.

Peijie Hou STAT509: Discrete Random Variable

Page 7: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Example: Mechanical Components

Possible Outcomes for One assembly is

S1S2S3 S1S2S3 S1S2S3 S1S2S3 S1S2S3 S1S2S3 S1S2S3 S1S2S30.84645 0.00045 0.00095 0.00495 0.00855 0.04455 0.09405 0.00005

Let Y = Number of components within specification in arandomly chosen assembly. Can you fill in the following table?

Y 0 1 2 3

P(Y = y)

Solution:

Peijie Hou STAT509: Discrete Random Variable

Page 8: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Probability Mass Functions of Discrete Variables

I Definition: Let X be a discrete random variable definedon some sample space S . The probability mass function(pmf) associated with X is defined to be

pX (x) = P(X = x).

I A pmf p(x) for a discrete random variable X satisfies thefollowing:

1. 0 ≤ p(x) ≤ 1, for all possible values of x .2. The sum of the probabilities, taken over all possible values

of X , must equal 1; i.e.,∑all x

p(x) = 1.

Peijie Hou STAT509: Discrete Random Variable

Page 9: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Cumulative Distribution Function

I Definition: Let Y be a random variable, the cumulativedistribution function (CDF) of Y is defined as

FY (y) ≡ P(Y ≤ y).

I F (y) ≡ P(Y ≤ y) is read, “the probability that therandom variable Y is less than or equal to the value y .”

I Property of cumulative distribution function

1. F (y) is a nondecreasing function.2. 0 ≤ FY (y) ≤ 1, since FY (y) = P(Y ≤ y) is a probability.

Peijie Hou STAT509: Discrete Random Variable

Page 10: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Example: Mechanical Components Revisited

I In the Mechanical Components example Y = Number ofcomponents within specification in a randomly chosenassembly. We already filled in the following table:

Y 0 1 2 3

P(Y = y) 0.00005 0.00635 0.14715 0.84645

I The pmf and CDF plots of Y

Peijie Hou STAT509: Discrete Random Variable

Page 11: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Expected Value of a Random Variable

I Definition: Let Y be a discrete random variable with pmfpY (y), the expected value or expectation of Y isdefined as

µ = E (Y ) =∑all y

ypY (y).

The expected value for a discrete random variable Y issimply a weighted average of the possible values of Y .Each value y is weighted by its probability pY (y).

I In statistical applications, µ = E (Y ) is commonly calledthe population mean.

Peijie Hou STAT509: Discrete Random Variable

Page 12: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Expected Value: Mechanical Components

I The pmf of Y in the mechanical components example is

Y 0 1 2 3

P(Y = y) 0.00005 0.00635 0.14715 0.84645

I The expected value of Y is

µ = E (Y ) =∑all y

ypY (y)

= 0(.00005) + 1(.00635) + 2(.14715) + 3(.84645)

= 2.84

I Interpretation: On average, we would expect 2.84components within specification in a randomly chosenassembly.

Peijie Hou STAT509: Discrete Random Variable

Page 13: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Expected Value as the Long-run Average

I The expected value can be interpreted as the long-runaverage.

I For the mechanical components example, if we randomlychoose an assembly, and record the number ofcomponents within specification. Over the long run, theaverage of these Y observation would be close(converge) to 2.84.

Peijie Hou STAT509: Discrete Random Variable

Page 14: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Expected Value of Functions of Y

I Definition: Let Y be a discrete random variable with pmfpY (y). Let g be a real-valued function defined on therange of Y . The expected value or expectation ofg(Y ) is defined as

E [g(Y )] =∑all y

g(y)PY (y).

I Interpretation: The expectation E [g(Y )] could be viewedas the weighted average of the function g when evaluatedat the random variable Y .

Peijie Hou STAT509: Discrete Random Variable

Page 15: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Properties of Expectation

Let Y be a discrete random variable with pmf pY (y). Supposethat g1, g2, . . . , gk are real-valued function, and let c be a realconstant. The expectation of Y satisfies the followingproperties:

1. E [c] = c .

2. E [cY ] = cE [Y ].

3. Linearity: E[∑k

j=1 gj(Y )]

=∑k

j=1 E [gj(Y )].

Remark: The 2nd and 3rd rule together imply that

E

k∑j=1

cjgj(Y )

=k∑

j=1

cjE [gj(Y )],

for constant cj .

Peijie Hou STAT509: Discrete Random Variable

Page 16: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Example: Mechanical Components

Question: In the Mechanical Components example,Y = Number of components within specification in a randomlychosen assembly.We calculated the expected value of Y E (Y ) = 2.84. Supposethat the cost (in dollars) to repair the assembly is given by thecost function g(Y ) = (3− Y )2.What is the expected cost to repair an assembly?

Solution: In order to use the linearity property, expanding thesquare term gives g(Y ) = 9− 6Y + Y 2 = Y 2 − 6Y + 9. Theexpected cost

E [g(Y )] = E [Y 2 − 6Y + 9] = E [Y 2]− 6E [Y ] + 9.

Peijie Hou STAT509: Discrete Random Variable

Page 17: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Example: Mechanical Components

The expected value of Y 2 is

E [Y 2] =∑all y

y2pY (y)

= 02(.00005) + 12(.00635) + 22(.14715) + 32(.84645)

= 8.213

It follows that the average cost in dollars equals

E [g(Y )] = 8.213− 6(2.84) + 9 = 0.173.

Interpretation: One average, the repair cost on each assemblyis $0.173.

Peijie Hou STAT509: Discrete Random Variable

Page 18: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Example: Project Management

A project manager for an engineering firm submitted bids onthree projects. The following table summarizes the firmschances of having the three projects accepted.

Project A B C

Prob of accept 0.30 0.80 0.10

Question: Assuming the projects are independent of oneanother, what is the probability that the firm will have all threeprojects accepted?Solution:

Peijie Hou STAT509: Discrete Random Variable

Page 19: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Example: Project Management Cont’d

Question: What is the probability of having at least oneproject accepted?

Project A B C

Prob of accept 0.30 0.80 0.10

Solution:

Peijie Hou STAT509: Discrete Random Variable

Page 20: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Example: Project Management Cont’d

Question: Let Y = number of projects accepted. Fill in thefollowing table and calculate the expected number of projectsaccepted.

Y 0 1 2 3

pY (y) 0.126

Solution:

Peijie Hou STAT509: Discrete Random Variable

Page 21: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Variance of a Random Variable

I Definition: Let Y be a discrete random variable with pmfpY (y) and expected value E (Y ) = µ. The variance ofY is given via

σ2 ≡ Var (Y ) ≡ E[(Y − µ)2

]=

∑all y

(y − µ)2pY (y).

Warning: Variance is always non-negative!

I Definition: The standard deviation of Y is the positivesquare root of the variance:

σ =√σ2 =

√Var (Y ).

Peijie Hou STAT509: Discrete Random Variable

Page 22: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Remarks on Variance (σ2)

Suppose Y is a random variable with mean µ and variance σ2.

I The variance is the average squared distance from themean µ.

I The variance of a random variable Y is a measure ofdispersion or scatter in the possible values for Y .

I The larger (smaller) σ2 is, the more (less) spread in thepopssible values of Y about the population mean E (Y ).

I Computing Formula:Var (Y ) = E [(Y − µ)2] = E [Y 2]− [E (Y )]2.

Peijie Hou STAT509: Discrete Random Variable

Page 23: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Properties on Variance (σ2)

Suppose X and Y are random variable with finite variance. Letc be a constant.

I Var (c) = 0.

I Var [cY ] = c2Var [Y ].

I Suppose X and Y are independent, then

Var [X + Y ] = Var [X ] + Var [Y ].

Question: Var [X − Y ] = Var [X ]−Var [Y ].

I TRUE

I FALSE.

Peijie Hou STAT509: Discrete Random Variable

Page 24: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Bernoulli Trials

Definition: Many experiments can be envisioned as consistingof a sequence of “trials,” these trails are called Bernoulli trailsif

1. Each trial results in only two possible outcomes, labelledas “success” and “failure”.

2. The trials are independent.

3. The probability of a success in each trial, denoted as p,remains constant.

Peijie Hou STAT509: Discrete Random Variable

Page 25: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Bernoulli Trials

I By convention, a success is recoded as “1,” and a failureas “0.”

I Define X to be the Bernoulli random variable

I X = 1 represents “success”; X = 0 represents “failure”I Suppose P(X = 1) = p and P(X = 0) = 1− p, then

E (X ) = p and Var(X ) = pq

Proof:

Peijie Hou STAT509: Discrete Random Variable

Page 26: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Examples of Bernoulli Trials

I Each sample of water has a 10% chance of containing aparticular organic pollutant. Assume that the samples areindependent with regard to the presence of the pollutant.

X Detecting water pollutant=“trial”X polluted water is observed=“success”X p = P(“success”)=P(polluted water)=0.1.

I Ninety-eight percent of all air traffic radar signals arecorrectly interpreted the first time they are transmitted.

X radar signal = “trial”X signal is correctly interpreted=“success”X p = P(“success”)=P(correct interpretation)=0.98.

Peijie Hou STAT509: Discrete Random Variable

Page 27: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Binomial Distribution

I Definition: Suppose that n Bernoulli trials are performed.Define

Y = the number of successes (out of n trials performed).

We say that the random variable Y has a binomialdistribution with number of trials n and successprobability p.

I Shorthand notation is Y ∼ b(n, p).

I Let us derive the pmf of Y through an example.

Peijie Hou STAT509: Discrete Random Variable

Page 28: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Example: Water Filters

I A manufacturer of water filters for refrigerators monitorsthe process for defective filters. Historically, this processaverages 5% defective filters. Five filters are randomlyselected.

I Find the probability that no filters are defective.solution:

I Find the probability that exactly 1 filter is defective.solution:

I Find the probability that exactly 2 filter is defective.solution:

I Can you see the pattern?

Peijie Hou STAT509: Discrete Random Variable

Page 29: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Probability Mass Function of Binomial R.V.

I Suppose Y ∼ b(n, p).

I The pmf of Y is given by

p(y) =

{(ny

)py (1− p)n−y for y = 0, 1, 2, . . . , n,

0 otherwise.

I Recall that(nr

)is the number of ways to choose r distinct

unordered objects from n distinct objects:(n

r

)=

n!

r !(n − r)!.

Peijie Hou STAT509: Discrete Random Variable

Page 30: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Mean and Variance of Binomial R.V.

I If Y ∼ b(n, p), then

E (Y ) = np

Var (Y ) = np(1− p).

I (Optional) I will derive above formulas on the blackboard.Proof:

Peijie Hou STAT509: Discrete Random Variable

Page 31: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Example: Radon Levels

Question: Historically, 10% of homes in Florida have radonlevels higher than that recommended by the EPA. In a randomsample of 20 homes, find the probability that exactly 3 haveradon levels higher than the EPA recommendation. Assumehomes are independent of one another.

Solution: Checking each house is a Bernoulli trial, whichsatisfiesX Two outcomes: higher than EPA recommendation or

satisfies EPA recommendation.

X Homes are independent of one another.

X The case that the radon level is higher than therecommendation is considered as a “success”. Thesuccess probability remains 10%.

Peijie Hou STAT509: Discrete Random Variable

Page 32: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Example: Radon Levels Cont’d

So, the binomial distribution is applicable in this example.Define

Y = number of home having radon level higher than EPA.

We have Y ∼ b(20, 0.1).

P(Y = 3) =

(20

3

)0.130.920−3 = 0.1901.

Doing calculation by R is much easier:

> dbinom(3,20,0.1)

[1] 0.1901199

Peijie Hou STAT509: Discrete Random Variable

Page 33: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Example: Radon Levels Cont’d

You can also calculate the probability that at least 6 homes outof the sample having higher radon level than recommended:

P[Y ≥ 6] = 1− P[Y ≤ 5].

Using R,

> 1-pbinom(5,20,0.1)

[1] 0.01125313

Peijie Hou STAT509: Discrete Random Variable

Page 34: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

It’s Your Turn...

If a manufacturing process has a 0.03 defective rate, what isthe probability that at least one of the next 25 units inspectedwill be defective?

(a)(251

)(0.03)1(0.97)24

(b) 1-(251

)(0.03)1(0.97)24

(c) 1-(250

)(0.03)0(0.97)25

(d) 1-(250

)(0.03)25(0.97)0

Peijie Hou STAT509: Discrete Random Variable

Page 35: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Question

A manufacturing process has a 0.03 defective rate. If werandomly sample 25 units

(a) What is the probability that less than 6 will be defective?

(b) What is the probability that 4 or more are defective?

(c) What is the probability that between 2 and 5, inclusive,are defective?

Peijie Hou STAT509: Discrete Random Variable

Page 36: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Geometric Distribution

Setting: Suppose that Bernoulli trials are continually observed.Define

Y = the number of trials to observe the first success.

Then we say that Y has a geometric distribution with successprobability p, where p is the success probability of Bernoullitrials.

I Shorthand notation is Y ∼ geom(p).

I The pmf of Y is

P(Y = y) = p(1− p)y−1 x = 1, 2, . . .

Peijie Hou STAT509: Discrete Random Variable

Page 37: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Example: Geometric Distribution

The probability that a wafer contains a large particle ofcontamination is 0.01. If it is assumed that the wafers areindependent, what is the probability that exactly 125 wafersneed to be analysed until a large particle is detected?

Solution: Let X denote the number of samples analysed until alarge particle is detected. Then X is a geometric randomvariable with p = 0.01. The requested probability is

P(X = 125) = (0.01)(0.99)125−1 = 0.0029

Peijie Hou STAT509: Discrete Random Variable

Page 38: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Mean and Variance of Geometric Distribution

Suppose Y ∼ geom(p), the expected value of Y is

E (Y ) =1

p.

Proof.Let q = 1− p, E (Y ) =

∑∞y=1 yq

y−1p = p∑∞

y=1 yqy−1. The

right-hand side is the derivative of p∑∞

y=1 qy w.r.t. q. By

geometric series,

p∞∑y=1

qy =pq

1− q.

It follows that E (Y ) = ∂∂q

[pq1−q

]= 1

p

Peijie Hou STAT509: Discrete Random Variable

Page 39: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Mean and Variance of Geometric Distribution

The Variance can be derived in a similar way, which is given by

Var (Y ) =1− p

p2

Peijie Hou STAT509: Discrete Random Variable

Page 40: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Golf Resort

The probability that a wafer contains a large particle ofcontamination is 0.01. If it is assumed that the wafers areindependent, then on average, how many wafers need to beanalysed until a large particle is detected?

Solution:

Peijie Hou STAT509: Discrete Random Variable

Page 41: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Hypergeometric Distribution

Setting: A set of N objects contains r objects classified assuccesses N − r objects classified as failures. A sample of sizen objects is selected randomly (without replacement) fromthe N objects. Define

Y = the number of success (out of the n selected).

We say that Y has a hypergeometric distribution and writeY ∼ hyper(N, n, r). The pmf of Y is given by

p(y) =

(ry)(N−r

n−y)(Nn)

, y ≤ r and n − y ≤ N − r

0, otherwise.

Peijie Hou STAT509: Discrete Random Variable

Page 42: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Mean and Variance of Hypergeometric Distribution

If Y ∼ hyper(N, n, r), then

I E (Y ) = n( rN )

I Var (Y ) = n( rN )(N−rN )(N−nN−1 ).

Peijie Hou STAT509: Discrete Random Variable

Page 43: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Example: Hypergeometric Distribution

A batch of parts contains 100 parts from a local supplier oftubing and 200 parts from a supplier of tubing in the nextstate. If four parts are selected randomly and withoutreplacement, what is the probability they are all from the localsupplier?

Solution: Let us identify the parameters first. In this example,N = , r = , and n = .Let X equal the number of parts in the sample from the localsupplier. Then,

P(X = 4) =

(1004

)(2000

)(3004

) = 0.0119

Peijie Hou STAT509: Discrete Random Variable

Page 44: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Example: Hypergeometric Distribution

The R code to calculate P(Y = y) is of the formdhyper(y,r,N-r,n). Using R,

> dhyper(4,100,200,4)

[1] 0.01185408

What is the probability that two or more parts in the sampleare from the local supplier?

P(X ≥ 2) = 1− P(X ≤ 1) = 1− 0.5925943,

where P(X ≤ 1) can be computed via R:

> phyper(1,100,200,4)

[1] 0.5925943

Peijie Hou STAT509: Discrete Random Variable

Page 45: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

It’s your turn...

If a shipment of 100 generators contains 5 faulty generators,what is the probability that we can select 10 generators fromthe shipment and not get a faulty one?

(a)(51)(959 )(10010 )

(b)(50)(959 )(1009 )

(c)(50)(9510)(10010 )

(d)( 510)(950 )(10010 )

Peijie Hou STAT509: Discrete Random Variable

Page 46: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Insulated Wire

I Consider a process that produces insulated copper wire.Historically the process has averaged 2.6 breaks in theinsulation per 1000 meters of wire. We want to find theprobability that 1000 meters of wire will have 1 or fewerbreaks in insulation?

I Is this a binomial problem?

I Is this a hypergeometric problem?

Peijie Hou STAT509: Discrete Random Variable

Page 47: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Poisson Distribution

Note: The Poisson distribution is commonly used to modelcounts, such as

1. the number of customers entering a post office in a givenhour

2. the number of α-particles discharged from a radioactivesubstance in one second

3. the number of machine breakdowns per month

4. the number of insurance claims received per day

5. the number of defects on a piece of raw material.

Peijie Hou STAT509: Discrete Random Variable

Page 48: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Poisson Distribution Cont’d

I Poisson distribution can be used to model the number ofevents occurring in a continuous time or space.

I Let λ be the average number of occurrences per baseunit and t is the number of base units inspected.

I Let Y = the number of “occurrences” over in a unitinterval of time (or space). Suppose Poisson distributionis adequate to describe Y . Then, the pmf of Y is given by

pY (y) =

{(λt)y e−λt

y ! , y = 0, 1, 2, . . .

0, otherwise.

I The shorthand notation is X ∼ Poisson(λt).

Peijie Hou STAT509: Discrete Random Variable

Page 49: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Mean and Variance of Poisson Distribution

If Y ∼ Poisson(λt),

E (Y ) = λt

Var (Y ) = λt.

Peijie Hou STAT509: Discrete Random Variable

Page 50: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Go Back to Insulated Wire

I “Historically the process has averaged 2.6 breaks in theinsulation per 1000 meters of wire” implies that theaverage number of occurrences λ = 2.6, and the baseunits is 1000 meters.

I Let X= number of breaks 1000 meters of wire will have.X ∼ Poisson(2.6).

I We have

P(Y = 0 ∪ Y = 1) =2.60e−2.6

0!+

2.61e−2.6

1!.

I Using R,

> dpois(0,2.6)+dpois(1,2.6)

[1] 0.2673849

Peijie Hou STAT509: Discrete Random Variable

Page 51: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Go Back to Insulated Wire Cont’d

I If we were inspecting 2000 meters of wire,λt = (2.6)(2) = 5.2.

Y ∼ Poisson(5.2).

I If we were inspecting 500 meters of wire,λt = (2.6)(0.5) = 1.3.

Y ∼ Poisson(1.3)

.

Peijie Hou STAT509: Discrete Random Variable

Page 52: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Conditions for a Poisson Distribution

I Areas of inspection are independent of one another.

I The probability of the event occurring at any particularpoint in space or time is negligible.

I The mean remains constant over all areas of inspection.

The pmf of Poisson distribution is derived based on these threeassumption!http://www.pp.rhul.ac.uk/~cowan/stat/notes/

PoissonNote.pdf

Peijie Hou STAT509: Discrete Random Variable

Page 53: STAT509: Discrete Random Variablepeople.stat.sc.edu/.../notes/DiscreteRandomVariable.pdfRandom Variable De nition Let S be the sample space of a random experiment. A random variable

Questions

1. Suppose we average 5 radioactive particles passing acounter in 1 millisecond. What is the probability thatexactly 3 particles will pass in one millisecond?Solution:

2. Suppose we average 5 radioactive particles passing acounter in 1 millisecond. What is the probability thatexactly 10 particles will pass in three milliseconds?Solution:

Peijie Hou STAT509: Discrete Random Variable