Binomial Random Variables


Binomial experiment

• A sequence of n trials (called Bernoulli trials), each of which results in either a “success” or a “failure”.

• The trials are independent and so the probability of success, p, remains the same for each trial.

• Define a random variable Y as the number of successes observed during the n trials.

• What is the probability p(y), for y = 0, 1, …, n?

• How many successes may we expect? E(Y) = ?

Returning Students

• Suppose the retention rate for a school indicates that the probability a freshman returns for their sophomore year is 0.65. Among 12 randomly selected freshmen, what is the probability that exactly 8 of them return to school next year?

Each student either returns or doesn’t. Think of each selected student as a trial, so n = 12.

If we consider “student returns” to be a success, then p = 0.65.

12 trials, 8 successes

• To find the probability of this event, consider the probability for just one sample point in the event.

• For example, the probability the first 8 students return and the last 4 don’t.

• Since independent, we just multiply the probabilities:

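$$
\begin{aligned}
P(S, S, S, S, S, S, S, S, F, F, F, F) &= P(R_1 \cap R_2 \cap \cdots \cap R_8 \cap \bar{R}_9 \cap \bar{R}_{10} \cap \bar{R}_{11} \cap \bar{R}_{12}) \\
&= P(R_1)\, P(R_2) \cdots P(R_8)\, P(\bar{R}_9) \cdots P(\bar{R}_{12}) \\
&= (0.65)^8 (1 - 0.65)^4
\end{aligned}
$$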

12 trials, 8 successes

• For the probability of this event, we sum the probabilities for each sample point in the event.

• How many sample points are in this event?

• How many ways can 8 successes and 4 failures occur?

$$C^{12}_{8} \, C^{4}_{4}, \text{ or simply } C^{12}_{8}$$

• Each of these sample points has the same probability.

• Hence, summing these probabilities yields

$$P(8 \text{ successes in } 12 \text{ trials}) = C^{12}_{8} \, (0.65)^8 (0.35)^4 \approx 0.237$$

Binomial Probability Function

• A random variable has a binomial distribution with parameters n and p if its probability function is given by

$$p(y) = C^{n}_{y} \, p^{y} (1 - p)^{n - y}, \quad y = 0, 1, \ldots, n$$
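As a quick check on the returning-students example, the binomial probability function can be sketched in a few lines of Python. The function name `binomial_pmf` is an illustrative choice, not something from the slides; `math.comb` supplies the count of ways to choose y successes out of n trials.

```python
from math import comb


def binomial_pmf(y: int, n: int, p: float) -> float:
    """P(Y = y) for a binomial random variable with parameters n and p."""
    if not 0 <= y <= n:
        return 0.0  # values outside 0..n have probability zero
    return comb(n, y) * p**y * (1 - p) ** (n - y)


# Returning-students example: n = 12 freshmen, p = 0.65, exactly 8 return.
prob_8_return = binomial_pmf(8, 12, 0.65)
print(round(prob_8_return, 3))  # 0.237
```

The result agrees with the value 0.237 obtained by hand.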

Rats!

• In a research study, rats are injected with a drug. The probability that a rat will die from the drug before the experiment is over is 0.16. Ten rats are injected with the drug.

What is the probability that at least 8 will survive?

Would you be surprised if at least 5 died during the experiment?
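One way to approach both questions is to sum binomial probabilities over the relevant values. The sketch below assumes the rats die independently of one another, so the number of survivors is binomial with n = 10 and p = 1 − 0.16 = 0.84; the variable names are illustrative.

```python
from math import comb


def binomial_pmf(y: int, n: int, p: float) -> float:
    """P(Y = y) for Y ~ binomial(n, p)."""
    return comb(n, y) * p**y * (1 - p) ** (n - y)


n_rats = 10
p_die = 0.16
p_survive = 1 - p_die

# P(at least 8 survive): sum over 8, 9, 10 survivors.
p_at_least_8_survive = sum(
    binomial_pmf(y, n_rats, p_survive) for y in range(8, n_rats + 1)
)

# P(at least 5 die): sum over 5, 6, ..., 10 deaths.
p_at_least_5_die = sum(
    binomial_pmf(d, n_rats, p_die) for d in range(5, n_rats + 1)
)

print(f"P(at least 8 survive) = {p_at_least_8_survive:.4f}")
print(f"P(at least 5 die)     = {p_at_least_5_die:.4f}")
```

Under these assumptions the first probability comes out near 0.79 and the second near 0.013, so five or more deaths would indeed be a surprising outcome.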

Quality Control

• For parts machined by a particular lathe, on average, 95% of the parts are within the acceptable tolerance.

• If 20 parts are checked, what is the probability that at least 18 are acceptable?

• If 20 parts are checked, what is the probability that at most 18 are acceptable?
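Both questions can be handled with short sums; for "at most 18" it is easier to subtract the two excluded values from 1. This is a sketch assuming the 20 parts are independent with p = 0.95 each, and the names are illustrative.

```python
from math import comb


def binomial_pmf(y: int, n: int, p: float) -> float:
    """P(Y = y) for Y ~ binomial(n, p)."""
    return comb(n, y) * p**y * (1 - p) ** (n - y)


n_parts = 20
p_ok = 0.95

# At least 18 acceptable: Y = 18, 19, or 20.
p_at_least_18 = sum(binomial_pmf(y, n_parts, p_ok) for y in (18, 19, 20))

# At most 18 acceptable: complement of Y = 19 or Y = 20.
p_at_most_18 = 1 - binomial_pmf(19, n_parts, p_ok) - binomial_pmf(20, n_parts, p_ok)

print(f"P(Y >= 18) = {p_at_least_18:.4f}")
print(f"P(Y <= 18) = {p_at_most_18:.4f}")
```

Note that the two events overlap at Y = 18, so the two probabilities add to more than 1.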

Binomial Theorem

• As we saw in our Discrete class, the Binomial Theorem allows us to expand

$$(p + q)^n = \sum_{y=0}^{n} C^{n}_{y} \, p^{y} q^{n-y}$$

• As a result, summing the binomial probabilities, where q = 1 − p is the probability of a failure,

$$\sum_{y=0}^{n} P(Y = y) = \sum_{y=0}^{n} C^{n}_{y} \, p^{y} (1 - p)^{n-y} = (p + (1 - p))^n = 1$$

Mean and Variance

• If Y is a binomial random variable with parameters n and p, the expected value and variance for Y are given by

$$E(Y) = np \quad \text{and} \quad V(Y) = np(1 - p)$$
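Before working through the derivations, the formulas can be sanity-checked numerically by computing the moments directly from the probability function. The helper name `binomial_moments` is hypothetical, and the values n = 12, p = 0.65 are borrowed from the returning-students example.

```python
from math import comb


def binomial_pmf(y: int, n: int, p: float) -> float:
    """P(Y = y) for Y ~ binomial(n, p)."""
    return comb(n, y) * p**y * (1 - p) ** (n - y)


def binomial_moments(n: int, p: float) -> tuple[float, float, float]:
    """Return (total probability, mean, variance) computed from the pmf."""
    pmf = [binomial_pmf(y, n, p) for y in range(n + 1)]
    total = sum(pmf)
    mean = sum(y * w for y, w in enumerate(pmf))
    second_moment = sum(y * y * w for y, w in enumerate(pmf))
    return total, mean, second_moment - mean**2


n, p = 12, 0.65
total, mean, variance = binomial_moments(n, p)
print(f"sum of p(y) = {total:.6f}")
print(f"E(Y) = {mean:.4f}   vs  np       = {n * p:.4f}")
print(f"V(Y) = {variance:.4f}   vs  np(1-p)  = {n * p * (1 - p):.4f}")
```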

Deriving Expected Value

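$$
\begin{aligned}
E(Y) &= \sum_{y=0}^{n} y \, p(y) = \sum_{y=1}^{n} y \, C^{n}_{y} \, p^{y} q^{n-y}, \quad \text{where } q = 1 - p \\
&= \sum_{y=1}^{n} y \, \frac{n!}{y!\,(n-y)!} \, p^{y} q^{n-y} \\
&= \sum_{y=1}^{n} \frac{n!}{(y-1)!\,(n-y)!} \, p^{y} q^{n-y} \\
&= n p \sum_{y=1}^{n} \frac{(n-1)!}{(y-1)!\,(n-y)!} \, p^{y-1} q^{n-y}
\end{aligned}
$$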

When y = 0, the summand is zero. Just as well start at y = 1.

And deriving…

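$$
\begin{aligned}
E(Y) &= n p \sum_{y=1}^{n} \frac{(n-1)!}{(y-1)!\,(n-y)!} \, p^{y-1} q^{n-y} \\
&= n p \sum_{y=1}^{n} \frac{(n-1)!}{(y-1)!\,(n-1-(y-1))!} \, p^{y-1} q^{n-1-(y-1)} \\
&= n p \sum_{y=1}^{n} C^{n-1}_{y-1} \, p^{y-1} q^{n-1-(y-1)} \\
&= n p \sum_{z=0}^{n-1} C^{n-1}_{z} \, p^{z} q^{n-1-z} \\
&= n p \, (p + q)^{n-1} = n p, \quad \text{since } p + q = 1
\end{aligned}
$$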

Deriving Variance?

Just the highlights (see page 104 for details). The derivation shows that

$$E[Y(Y-1)] = n(n-1)p^{2}$$

and so

$$E(Y^2) = E[Y^2 - Y] + E(Y) = E[Y(Y-1)] + E(Y) = n(n-1)p^{2} + np$$

Thus,

$$
\begin{aligned}
V(Y) &= E(Y^2) - [E(Y)]^2 \\
&= [n(n-1)p^{2} + np] - (np)^2 \\
&= n^2p^2 - np^2 + np - n^2p^2 \\
&= np - np^2 \\
&= np(1 - p) = npq
\end{aligned}
$$

A “fairly common trick”: use $E[Y(Y-1)]$ to find $E(Y^2)$.

Rats!

• In a research study, rats are injected with a drug. The probability that a rat will die from the drug before the experiment is over is 0.16. Ten rats are injected with the drug.

• How many of the rats are expected to survive?

• Find the variance for the number of survivors.
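The two answers follow directly from E(Y) = np and V(Y) = np(1 − p) with Y counting survivors. The sketch below computes them and, as an optional cross-check, compares them against a small simulation. The simulation size of 20,000 and the seed are arbitrary choices, and independence between rats is assumed.

```python
import random


def simulate_survivors(n_rats: int, p_survive: float, rng: random.Random) -> int:
    """Simulate one experiment and return the number of surviving rats."""
    return sum(1 for _ in range(n_rats) if rng.random() < p_survive)


n_rats = 10
p_survive = 1 - 0.16  # a "success" here is a rat surviving

# Formulas for a binomial random variable.
expected_survivors = n_rats * p_survive
variance_survivors = n_rats * p_survive * (1 - p_survive)

# Cross-check by simulation (seeded so the run is repeatable).
rng = random.Random(0)
counts = [simulate_survivors(n_rats, p_survive, rng) for _ in range(20_000)]
sample_mean = sum(counts) / len(counts)
sample_var = sum((c - sample_mean) ** 2 for c in counts) / (len(counts) - 1)

print(f"E(Y) = {expected_survivors:.3f}, simulated mean     = {sample_mean:.3f}")
print(f"V(Y) = {variance_survivors:.3f}, simulated variance = {sample_var:.3f}")
```

The formulas give 8.4 expected survivors with variance 1.344; the simulated values should land close to these.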

Geometric Random Variables

Your 1st Success

• Similar to the binomial experiment, we consider:

• A sequence of independent Bernoulli trials.

• The probability of “success” equals p on each trial.

• Define a random variable Y as the number of the trial on which the 1st success occurs. (Stop the trials after the first success occurs.)

• What is the probability p(y), for y = 1, 2, …?

• On which trial is the first success expected?

Finding the probability

• Consider the values of Y:

y = 1: (S), so p(1) = p
y = 2: (F, S), so p(2) = (q)(p)
y = 3: (F, F, S), so p(3) = (q^2)(p)
y = 4: (F, F, F, S), so p(4) = (q^3)(p)
and so on…

[Tree diagram: each trial branches to S (stop) or F (continue), giving the paths (S), (F, S), (F, F, S), (F, F, F, S), …]

Geometric Probability Function

• A random variable has a geometric distribution with parameter p if its probability function is given by

$$p(y) = q^{y-1} p, \quad \text{where } q = 1 - p, \text{ for } y = 1, 2, \ldots$$
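The geometric probability function is simple enough to sketch directly. In the example below, `geometric_pmf` is an illustrative name and p = 0.2 is an arbitrary value chosen only for demonstration.

```python
def geometric_pmf(y: int, p: float) -> float:
    """P(Y = y): first success occurs on trial y (y = 1, 2, ...)."""
    if y < 1:
        return 0.0  # the first success cannot occur before trial 1
    return (1 - p) ** (y - 1) * p


p = 0.2  # assumed success probability, for illustration only
for y in range(1, 5):
    print(f"p({y}) = {geometric_pmf(y, p):.4f}")

# The probabilities over y = 1, 2, ... should add up to 1;
# a long partial sum gets very close.
partial_sum = sum(geometric_pmf(y, p) for y in range(1, 201))
print(f"sum of p(1..200) = {partial_sum:.6f}")
```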

Success?

• Of course, you need to be clear on what you consider a “success”.

• For example, the 1st success might mean finding the 1st defective item!

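[Tree diagram: each inspected item is either D (defective) or G (good); inspection stops at the first D, giving the paths (D), (G, D), (G, G, D), …]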

Geometric Mean, Variance

• If Y is a geometric random variable with parameter p, the expected value and variance for Y are given by

$$E(Y) = \frac{1}{p} \quad \text{and} \quad V(Y) = \frac{1 - p}{p^{2}}$$
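These formulas can be checked numerically by summing the series for E(Y) and E(Y²) out to a large cutoff. This is only a sketch: the cutoff `max_y = 2000` and the value p = 0.2 are assumptions for illustration, and the truncated tail is negligible for this p.

```python
def geometric_pmf(y: int, p: float) -> float:
    """P(Y = y) for a geometric random variable, y = 1, 2, ..."""
    return (1 - p) ** (y - 1) * p


def geometric_moments(p: float, max_y: int = 2000) -> tuple[float, float]:
    """Approximate (mean, variance) by truncating the infinite sums at max_y."""
    mean = sum(y * geometric_pmf(y, p) for y in range(1, max_y + 1))
    second_moment = sum(y * y * geometric_pmf(y, p) for y in range(1, max_y + 1))
    return mean, second_moment - mean**2


p = 0.2
mean, variance = geometric_moments(p)
print(f"E(Y) = {mean:.4f}   vs  1/p       = {1 / p:.4f}")
print(f"V(Y) = {variance:.4f}  vs  (1-p)/p^2 = {(1 - p) / p**2:.4f}")
```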

Deriving the Mean

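$$
\begin{aligned}
E(Y) &= \sum_{y=1}^{\infty} y \, p(y) = \sum_{y=1}^{\infty} y \, q^{y-1} p, \quad \text{where } q = 1 - p \\
&= p \sum_{y=1}^{\infty} y \, q^{y-1} = p \, (1 + 2q + 3q^2 + 4q^3 + \cdots) \\
&= p \, \frac{d}{dq} (q + q^2 + q^3 + q^4 + \cdots) \\
&= p \, \frac{d}{dq} \left( \frac{q}{1 - q} \right) \\
&= p \cdot \frac{1}{(1 - q)^2} = \frac{1}{p}
\end{aligned}
$$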

Deriving Variance

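Using the “trick” of finding $E[Y(Y-1)]$ to get $E(Y^2)$…

$$
\begin{aligned}
E[Y(Y-1)] &= \sum_{y=1}^{\infty} y(y-1) \, p(y) = \sum_{y=2}^{\infty} y(y-1) \, q^{y-1} p \\
&= pq \sum_{y=2}^{\infty} y(y-1) \, q^{y-2} = pq \, (2 + 6q + 12q^2 + \cdots) \\
&= pq \, \frac{d^2}{dq^2} (q^2 + q^3 + q^4 + \cdots) \\
&= pq \, \frac{d^2}{dq^2} \left( \frac{q^2}{1 - q} \right) \\
&= pq \cdot \frac{2}{(1 - q)^3} = \frac{2q}{p^2}
\end{aligned}
$$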

Deriving Variance

Now, forming the second moment, $E(Y^2)$…

$$E(Y^2) = E[Y(Y-1)] + E[Y] = \frac{2q}{p^2} + \frac{1}{p} = \frac{2q + p}{p^2}$$

And so, we find the variance…

$$V(Y) = E(Y^2) - [E(Y)]^2 = \frac{2q + p}{p^2} - \frac{1}{p^2} = \frac{1 - p}{p^2}$$

At least ‘a’ trials? (#3.55)

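• For a geometric random variable and a positive integer a, show $P(Y > a) = q^a$

• Consider

$$
\begin{aligned}
P(Y > a) &= 1 - P(Y \le a) \\
&= 1 - p \, (1 + q + q^2 + \cdots + q^{a-1}) \\
&= 1 - p \cdot \frac{1 - q^a}{1 - q} \\
&= q^a
\end{aligned}
$$

based on the sum of a geometric series and the fact that 1 − q = p.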

At least b more trials?

• Based on the result, it follows that $P(Y > a + b) = q^{a+b}$

• Also, the conditional probability is

$$P(Y > a + b \mid Y > a) = \frac{P(Y > a + b)}{P(Y > a)} = \frac{q^{a+b}}{q^a} = q^b = P(Y > b)$$

“the memoryless property”

No Memory?

• For the geometric distribution, $P(Y > a + b \mid Y > a) = q^b = P(Y > b)$

• This implies $P(Y > 7 \mid Y > 2) = q^5 = P(Y > 5)$

“knowing the first two trials were failures, the probability a success won’t occur on the next 5 trials”

as compared to “just starting the trials and a success won’t occur on the first 5 trials”: the same probability?!
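The memoryless property can be confirmed numerically for the case a = 2, b = 5 discussed above. In this sketch the tail probability is computed from the probability function rather than from the closed form, so the comparison is a genuine check; p = 0.3 is an arbitrary value for illustration and `geometric_tail` is a hypothetical helper name.

```python
def geometric_tail(a: int, p: float) -> float:
    """P(Y > a), computed as 1 - P(Y <= a) from the probability function."""
    return 1 - sum((1 - p) ** (y - 1) * p for y in range(1, a + 1))


p = 0.3  # assumed success probability
q = 1 - p
a, b = 2, 5

# Conditional probability P(Y > a + b | Y > a) = P(Y > a + b) / P(Y > a).
conditional = geometric_tail(a + b, p) / geometric_tail(a, p)

print(f"P(Y > 2)         = {geometric_tail(a, p):.6f}  (q^2 = {q**a:.6f})")
print(f"P(Y > 7 | Y > 2) = {conditional:.6f}")
print(f"P(Y > 5)         = {geometric_tail(b, p):.6f}  (q^5 = {q**b:.6f})")
```

The last two printed values agree, which is exactly what "no memory" means here.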

Estimating p (example 3.13)

• A large company is considering implementing a new policy, so we ask employees whether or not they favor it.

• Suppose the first four reject the new policy, but the 5th individual is in favor of the policy.

What does this tell us about the percentage of employees we might expect to favor the policy?

Can we estimate the probability p of getting a favorable vote on any given trial?

What value of p is most likely?

• We wish to find the value of p which would make it most probable that the 5th individual turns out to be the first “success”.

• That is, let’s maximize the probability of finding the first success on trial 5, where $p(5) = (1 - p)^4 p$

• For what value of p is this probability a max?

Find the Extrema

$$\frac{d}{dp} \left[ (1 - p)^4 p \right] = (1 - p)^3 (1 - 5p)$$

The derivative is zero and the probability is at its maximum when p = 0.2

Using the derivative to locate the maximum

“the method of maximum likelihood”
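As an alternative to the calculus, the maximizing value can be located by brute force: evaluate the probability (1 − p)⁴p on a fine grid of p values and pick the largest. This is only an illustrative sketch; the grid spacing of 0.001 and the function name `likelihood` are arbitrary choices.

```python
def likelihood(p: float, first_success_trial: int = 5) -> float:
    """Probability that the first success occurs on the given trial."""
    return (1 - p) ** (first_success_trial - 1) * p


# Candidate values p = 0.000, 0.001, ..., 1.000.
grid = [i / 1000 for i in range(1001)]

# The grid point with the largest likelihood is the estimate of p.
p_hat = max(grid, key=likelihood)

print(f"estimate of p      = {p_hat}")
print(f"likelihood at p_hat = {likelihood(p_hat):.5f}")
```

The grid search lands on p = 0.2, matching the value found by setting the derivative to zero.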