
Statistical Applications

Binomial and Poisson Probability Distributions

Learning Objectives

After the session, students should be able to:

• Evaluate discrete probability distributions from realistic data
• Use the binomial distribution to evaluate simple probabilities
• Evaluate probabilities using the Poisson distribution
• Answer examination-type questions pertaining to these distributions

Recap- Types of data

• Discrete data (a variable restricted to a fixed set of values)
• Continuous data (a variable measured on a continuous scale)
• These data may be collected (ungrouped) and then grouped together in a particular form so that they can be easily inspected
• But how would we collect data?

Recap: Sampling Techniques

• Simple random sampling
• Stratified sampling
• Cluster sampling
• Quota sampling
• Systematic sampling
• Mechanical sampling
• Convenience sampling

Random Variables

A random variable is a numerical description of the outcome of an experiment.

A discrete random variable may assume either a finite number of values or an infinite sequence of values.

A continuous random variable may assume any numerical value in an interval or collection of intervals.

Examples

Question | Random variable x | Type
Family size | x = number of dependents reported on tax return | Discrete
Distance from home to store | x = distance in miles from home to the store site | Continuous
Own dog or cat | x = 1 if own no pet; = 2 if own dog(s) only; = 3 if own cat(s) only; = 4 if own dog(s) and cat(s) | Discrete

Discrete Probability Distributions

The probability distribution for a random variable describes how probabilities are distributed over the values of the random variable.

We can describe a discrete probability distribution with a table, graph, or equation.

Discrete Probability Distributions cont…

The probability distribution is defined by a probability function, denoted by f(x), which provides the probability for each value of the random variable.

The required conditions for a discrete probability function are:

$$f(x) \ge 0$$

$$\sum f(x) = 1$$

Discrete Uniform Probability Distribution

The discrete uniform probability function is

$$f(x) = 1/n$$

where n = the number of values the random variable may assume.

The values of the random variable are equally likely.

Relative frequency

Say a shop uses past knowledge to produce a tabular representation of the probability distribution for TV sales:

Units sold (x) | Number of days | f(x)
0 | 80 | .40 (= 80/200)
1 | 50 | .25
2 | 40 | .20
3 | 10 | .05
4 | 20 | .10
Total | 200 | 1.00
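The relative-frequency step can be sketched in a few lines of Python. This is only an illustration of the table's arithmetic; the variable names are hypothetical and are not part of the original slides.

```python
# Number of days on which 0, 1, 2, 3 or 4 TVs were sold (counts from the table).
days_by_units_sold = {0: 80, 1: 50, 2: 40, 3: 10, 4: 20}

total_days = sum(days_by_units_sold.values())

# Relative frequency: f(x) = (days with x units sold) / (total days observed).
f = {x: days / total_days for x, days in days_by_units_sold.items()}

for x, prob in f.items():
    print(f"f({x}) = {prob:.2f}")

# A valid discrete probability function is non-negative and sums to 1.
assert all(prob >= 0 for prob in f.values())
assert abs(sum(f.values()) - 1.0) < 1e-12
```

The two assertions at the end are the required conditions for a discrete probability function stated earlier.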

Mean Value

The mean, or expected value, of a random variable is a measure of its central location.

$$E(x) = \mu = \sum x f(x)$$

x | f(x) | x f(x)
0 | .40 | .00
1 | .25 | .25
2 | .20 | .40
3 | .05 | .15
4 | .10 | .40
Total | 1.00 | E(x) = 1.20

E(x) = 1.20 is the expected number of TVs sold in a day.

Mean cont…

• Graphical representation of the probability distribution

[Figure: bar chart of probability f(x) (vertical axis, .10 to .50) against values of the random variable x (TV sales, 0 to 4).]

Variance & Standard deviation

The variance summarizes the variability in the values of a random variable.

$$\mathrm{Var}(x) = \sigma^2 = \sum (x - \mu)^2 f(x)$$

NOTE: The standard deviation, $\sigma$, is defined as the positive square root of the variance.
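As a hedged illustration, the mean, variance and standard deviation formulas can be applied to the TV sales distribution from the earlier slides. The dictionary `f` below simply restates that table; the names are illustrative.

```python
from math import sqrt

# Probability function for daily TV sales: x -> f(x), as tabulated earlier.
f = {0: 0.40, 1: 0.25, 2: 0.20, 3: 0.05, 4: 0.10}

# E(x) = sum of x * f(x)
mean = sum(x * p for x, p in f.items())

# Var(x) = sum of (x - mean)^2 * f(x)
variance = sum((x - mean) ** 2 * p for x, p in f.items())

# The standard deviation is the positive square root of the variance.
std_dev = sqrt(variance)

print(f"E(x)   = {mean:.2f}")
print(f"Var(x) = {variance:.2f}")
print(f"sigma  = {std_dev:.2f}")
```

For this table the mean is 1.20 TVs per day, the variance works out to 1.66 and the standard deviation to about 1.29.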

Binomial Distribution

• Four properties of a binomial experiment:

1. The experiment consists of a sequence of n identical trials.
2. Two outcomes, success and failure, are possible on each trial.
3. The probability of a success, denoted by p, does not change from trial to trial (the stationarity assumption).
4. The trials are independent.

• Of interest is the number of successes occurring in the n trials
• Let x be the number of successes

Binomial Probability Function

$$f(x) = \frac{n!}{x!(n-x)!} p^x (1-p)^{n-x}$$

where:

f(x) = the probability of x successes in n trials

n = the number of trials

p = the probability of success on any one trial

Jacob Bernoulli

Binomial Probability Function cont…

• Evaluation of probabilities using the distribution function:

$$f(x) = \frac{n!}{x!(n-x)!} p^x (1-p)^{n-x}$$

$\frac{n!}{x!(n-x)!}$ = number of experimental outcomes providing exactly x successes in n trials

$p^x (1-p)^{n-x}$ = probability of a particular sequence of trial outcomes with x successes in n trials

Binomial probability function alternative notation

• Evaluation of probabilities using the distribution function:

$$P(X = r) = {}^{n}C_{r} \, p^{r} (1-p)^{n-r}$$

${}^{n}C_{r}$ = number of combinations, i.e. the number of experimental outcomes providing exactly r successes in n trials

$p^{r} (1-p)^{n-r}$ = probability of a particular sequence of trial outcomes with r successes in n trials

Notice the pattern of numbers.

Mean and Variance

It is useful to note that for a binomial distribution the following are valid:

Expected value: $E(x) = \mu = np$

Variance: $\mathrm{Var}(x) = \sigma^2 = np(1-p)$

Standard deviation: $\sigma = \sqrt{np(1-p)}$
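These shortcut formulas can be checked against the general definitions of mean and variance. The sketch below does this for illustrative values n = 12 and p = 0.1 (chosen here for the check, not prescribed by the slide); `binomial_pmf` is a hypothetical helper name.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 12, 0.1  # illustrative values only

# General definitions: E(x) = sum x f(x), Var(x) = sum (x - mu)^2 f(x)
mean_by_sum = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))
var_by_sum = sum((x - mean_by_sum) ** 2 * binomial_pmf(x, n, p) for x in range(n + 1))

# Shortcut formulas for the binomial distribution
mean_shortcut = n * p
var_shortcut = n * p * (1 - p)

print(f"mean: {mean_by_sum:.4f} (sum) vs {mean_shortcut:.4f} (np)")
print(f"var:  {var_by_sum:.4f} (sum) vs {var_shortcut:.4f} (np(1-p))")
print(f"std:  {sqrt(var_shortcut):.4f}")
```

Both routes should agree up to floating-point rounding, which is why the shortcuts save work in examination questions.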

Example #1 :

Evans is concerned about a low retention rate for employees. In recent years, management has seen a turnover of 10% of the hourly employees annually. Thus, for any hourly employee chosen at random, management estimates a probability of 0.1 that the person will not be with the company next year.

Choosing 3 hourly employees at random, what is the probability that 1 of them will leave the company this year?

Solution:

Using the binomial probability function

$$f(x) = \frac{n!}{x!(n-x)!} p^x (1-p)^{n-x}$$

Let: p = 0.1, n = 3, x = 1

$$f(1) = \frac{3!}{1!(3-1)!} (0.1)^1 (0.9)^2 = 3(.1)(.81) = .243$$
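The calculation in Example #1 can be reproduced numerically. The following is a minimal sketch using Python's standard library; the function name `binomial_pmf` is illustrative.

```python
from math import comb

def binomial_pmf(x, n, p):
    """f(x) = n! / (x! (n - x)!) * p^x * (1 - p)^(n - x)"""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Example #1: p = 0.1, n = 3, x = 1
prob_one_leaves = binomial_pmf(1, 3, 0.1)
print(f"f(1) = {prob_one_leaves:.3f}")

# The whole distribution for n = 3, p = 0.1 sums to 1.
distribution = [binomial_pmf(x, 3, 0.1) for x in range(4)]
print([round(prob, 3) for prob in distribution])
```

Here `math.comb(n, x)` supplies the number of combinations, so the factorials do not have to be written out by hand.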

Exercise #1

A milling machine is known to produce 9% defective components. If a random sample of 5 components is taken, evaluate the probability of no more than 2 components being defective.

Exercise #1: Solution

• Find the required parameters, namely:
• p = 0.09
• n = 5
• X < 3
• Here you will need to use a little intelligence, i.e.:

$$p = P(X < 3 \mid X \sim \mathrm{Bi}(0.09, 5))$$

$$= P(X = 0 \mid X \sim \mathrm{Bi}(0.09, 5)) + P(X = 1 \mid X \sim \mathrm{Bi}(0.09, 5)) + P(X = 2 \mid X \sim \mathrm{Bi}(0.09, 5))$$

Now put the numbers into:

$$f(x) = \frac{n!}{x!(n-x)!} p^x (1-p)^{n-x}$$

Exercise #1: Solution cont…

Using this standard formula, with 1 − p = 1 − 0.09 = 0.91, gives

$$P(X = 0) = \frac{5!}{0!\,5!} (0.09)^0 (0.91)^5 = 0.624$$

$$P(X = 1) = \frac{5!}{1!\,4!} (0.09)^1 (0.91)^4 = 0.309$$

$$P(X = 2) = \frac{5!}{2!\,3!} (0.09)^2 (0.91)^3 = 0.061$$

Notice the pattern: the two factorials on the bottom add up to the factorial on the top, e.g. 4 + 1 = 5.

$$P(X = 0) + P(X = 1) + P(X = 2) = 0.624 + 0.309 + 0.061 = 0.994$$

Further example

A machine produces on average 1 defective part out of 8. A sample of 5 parts is collected from this machine. Find the probability that 2 of them are defective.

$$p = P(X = 2 \mid X \sim \mathrm{Bi}(1/8, 5))$$

Solution:

$$P(X = 2) = {}^{5}C_{2} (0.125)^2 (1 - 0.125)^3 = \frac{5!}{2!\,3!} (0.015625)(0.669921875) = 0.1047$$

Notice here the new nomenclature ${}^{5}C_{2}$.
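A short sketch of how a single binomial probability and a cumulative "no more than k" probability relate, using the values of this further example (n = 5, p = 1/8). The helper names `binomial_pmf` and `binomial_cdf` are hypothetical.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes: nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binomial_cdf(k, n, p):
    """Probability of no more than k successes: P(X=0) + ... + P(X=k)."""
    return sum(binomial_pmf(x, n, p) for x in range(k + 1))

n, p = 5, 1 / 8

exactly_two = binomial_pmf(2, n, p)
at_most_two = binomial_cdf(2, n, p)

print(f"P(X = 2)  = {exactly_two:.4f}")
print(f"P(X <= 2) = {at_most_two:.4f}")
```

The cumulative function is just the "add up the individual terms" step used in Exercise #1, written as a loop.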

Poisson’s distribution

This distribution is named after the famous French mathematician who formulated it: Siméon Denis Poisson

A Poisson distributed random variable is often useful in estimating the number of occurrences over a specified interval of time or space.

It is a discrete random variable that may assume an infinite sequence of values (x = 0, 1, 2, . . . ).

Poisson’s random variables

They can be time dependent or not!

Examples of a Poisson distributed random variable:

• the number of knotholes in 14 linear feet of pine board
• the number of vehicles arriving at a toll booth in one hour

Poisson distribution function

Just as with the binomial distribution, this allows the calculation of probabilities!

$$f(x) = \frac{\mu^x e^{-\mu}}{x!}$$

where:

f(x) = probability of x occurrences in an interval

$\mu$ = mean number of occurrences in an interval

e = 2.71828

Poisson’s cumulative distribution function

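By definition this is given by:

$$P(X \le r \mid X \sim \mathrm{Po}(\mu)) = e^{-\mu} \left( \frac{\mu^0}{0!} + \frac{\mu^1}{1!} + \frac{\mu^2}{2!} + \frac{\mu^3}{3!} + \dots + \frac{\mu^r}{r!} \right)$$

Remembering this pattern helps in the evaluation of the required probabilities, since the terms in the series, each multiplied by $e^{-\mu}$, are respectively P(X=0), P(X=1), P(X=2), P(X=3), ...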
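The series pattern lends itself to a simple loop in which each term is obtained from the previous one. The sketch below is one possible way to code it; `poisson_cdf` is a hypothetical name and `mu` stands for the mean number of occurrences.

```python
from math import exp

def poisson_cdf(r, mu):
    """P(X <= r) for a Poisson variable with mean mu, summed term by term."""
    term = 1.0      # first term of the series: mu^0 / 0!
    total = term
    for k in range(1, r + 1):
        term *= mu / k   # turns mu^(k-1)/(k-1)! into mu^k/k!
        total += term
    return exp(-mu) * total

# Illustrative use: probability of at most 2 occurrences when the mean is 2.
print(f"P(X <= 2 | mu = 2) = {poisson_cdf(2, 2.0):.4f}")
```

Building each term from the one before avoids recomputing powers and factorials, which mirrors how the terms are usually evaluated by hand.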

Example #2:

Patients arrive at the Casualty department of a hospital at the average rate of 6 per hour on weekend evenings. What is the probability of 4 arrivals in 30 minutes on a weekend evening?

Example #2: Solution

• Simply use the Poisson distribution function:

$\mu$ = 6/hour = 3/half-hour, x = 4

$$f(4) = \frac{3^4 (2.71828)^{-3}}{4!} = .1680$$
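Example #2 can be checked with a few lines of Python. This is a minimal sketch: the helper `poisson_pmf` and the variable names are illustrative, and the key step is rescaling the hourly rate to the 30-minute interval before applying the formula.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """f(x) = mu^x * e^(-mu) / x!"""
    return mu**x * exp(-mu) / factorial(x)

rate_per_hour = 6.0    # average arrivals per hour (from the example)
interval_hours = 0.5   # the question asks about a 30-minute interval

# The mean must refer to the same interval as the question.
mu = rate_per_hour * interval_hours

prob_four_arrivals = poisson_pmf(4, mu)
print(f"mu = {mu}, f(4) = {prob_four_arrivals:.4f}")
```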

Poisson’s distribution cont…

[Figure: Poisson distribution of arrivals. Bar chart of Poisson probabilities (vertical axis, 0.00 to 0.25) against the number of arrivals in 30 minutes (0 to 10).]

NB: The Poisson distribution has the very special property of the mean and variance being equal!

$$\mu = \sigma^2$$

Also, when n is large (n > 50) and np is small (np < 5), this distribution approximates the binomial.
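To see how close the approximation is, the two distributions can be compared side by side. The sketch below uses illustrative values n = 100 and p = 0.02, chosen only because they satisfy n > 50 and np < 5; the helper names are hypothetical.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, mu):
    return mu**x * exp(-mu) / factorial(x)

n, p = 100, 0.02   # illustrative values: n large, np small
mu = n * p         # the Poisson mean is set equal to the binomial mean np

for x in range(6):
    b = binomial_pmf(x, n, p)
    q = poisson_pmf(x, mu)
    print(f"x={x}: binomial={b:.4f}  Poisson={q:.4f}  difference={abs(b - q):.4f}")

# Largest disagreement between the two probability functions over all x.
largest_gap = max(
    abs(binomial_pmf(x, n, p) - poisson_pmf(x, mu)) for x in range(n + 1)
)
print(f"largest difference = {largest_gap:.4f}")
```

For these values the two probability functions differ by well under 0.01 at every x, which is why the Poisson form is a convenient substitute when n is large and p is small.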

Exercise #2:

A serviceman is “beeped” each time there is a call for service. The number of beeps per hour is Poisson distributed with a mean of 2 per hour. Find the probability that he gets beeped 3 times in the next 2 hours.

Solution: The units of the interval need to be uniform, so the mean beep rate is 4 per 2-hour interval. Applying the Poisson probability function gives:

$$P(X = 3) = \frac{4^3 e^{-4}}{3!} = \frac{(64)(0.0183)}{6} = 0.195$$

Further Example

A garage workshop has an expensive machine tool which is used on average 1.6 times per 8-hour day, for a four-hour period each time. On how many days in a 60-day work period is the tool required more than twice?

$$p = 1 - P(X = 0) - P(X = 1) - P(X = 2) = 1 - e^{-1.6} \left( \frac{1.6^0}{0!} + \frac{1.6^1}{1!} + \frac{1.6^2}{2!} \right)$$

$$= 1 - 0.202 - 0.323 - 0.258 = 0.217$$

Hence the required no. of days $= 0.217 \times 60 \approx 13$

Examination type questions

1. A machine is known to produce 10% defective components. If a random sample of 12 components is taken, evaluate the probability of:

a) no components being defective [2]

b) more than 3 components being defective [3]

2. If jobs arrive at a machine at random, at an average rate of 10 per hour, estimate the probability of the machine remaining idle for a 1.5-hour period [4].

a) State the standard deviation of this distribution [1].

Further examination type questions

3. Over a long period of time it is known that 5% of the total production is below standard. If 6 items are chosen at random, evaluate the probability that at least 2 are defective [5].

4. A machine is known to produce 2% defective components. In a packet of 100, what is the probability of obtaining over 2 defective components? [5]

Solutions:

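1. Here a simple application of the binomial is required, thus:

a)

$$P(X = 0) = \frac{12!}{0!\,12!} (0.10)^0 (0.90)^{12} = 0.2824$$

b)

$$P(X \le 3) = 0.2824 + {}^{12}C_{1} (0.1)^1 (0.9)^{11} + {}^{12}C_{2} (0.1)^2 (0.9)^{10} + {}^{12}C_{3} (0.1)^3 (0.9)^{9}$$

$$= 0.2824 + 0.3766 + 0.2301 + 0.0852 = 0.9744$$

so that

$$P(X > 3) = 1 - 0.9744 = 0.0256$$

2. Here let X be a Poisson RV with mean 10 × 1.5 = 15 jobs per 1.5-hour period, thus:

$$p = P(X = 0 \mid X \sim \mathrm{Po}(15)) = \frac{15^0 e^{-15}}{0!} \approx 0$$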

Solutions:

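3. This is the binomial model:

$$P(X \ge 2) = 1 - P(X = 0) - P(X = 1) = 1 - {}^{6}C_{0} (0.05)^0 (0.95)^6 - {}^{6}C_{1} (0.05)^1 (0.95)^5$$

$$= 1 - (0.95)^6 - 6(0.05)(0.95)^5 = 1 - 0.7351 - 0.2321 = 0.0328$$

4. We could use the binomial here, but the Poisson approximation also applies, with mean np = 100 × 0.02 = 2, thus:

$$p = P(X > 2 \mid X \sim \mathrm{Po}(2)) = 1 - e^{-2} \left( 1 + 2 + \frac{2^2}{2!} \right) = 1 - 5e^{-2} = 0.3233$$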

Alternative solution 4:

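4. We could use the binomial here also, with n = 100 and p = 0.02:

$$P(X > 2) = 1 - P(X \le 2 \mid X \sim \mathrm{Bi}(0.02, 100))$$

$$= 1 - \left[ {}^{100}C_{0} (0.02)^0 (0.98)^{100} + {}^{100}C_{1} (0.02)^1 (0.98)^{99} + {}^{100}C_{2} (0.02)^2 (0.98)^{98} \right]$$

$$= 1 - \left[ (0.98)^{100} + 100 (0.02)(0.98)^{99} + {}^{100}C_{2} (0.02)^2 (0.98)^{98} \right]$$

$$= 1 - 0.1326 - 0.2707 - 0.2734 = 0.3233$$

Note: It’s worth spotting the Poisson approximation if it turns up, since it needs fewer calculations!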

Summary

Have we met our learning objectives? Specifically, are you able to:

• Evaluate discrete probability distributions from realistic data
• Use the binomial distribution to evaluate simple probabilities
• Evaluate probabilities using the Poisson distribution
• Answer examination-type questions pertaining to these distributions