
MA 490

Senior Project

Project: Prove that the cumulative binomial distributions and the Poisson distributions can be approximated by the Normal distribution and that that approximation gets better as the numbers increase.

Before proving that these distributions can be approximated, one must first understand the distributions themselves. The binomial distribution represents the total number of successes in n Bernoulli trials under specific conditions: only two outcomes are possible on each of the n trials, the probability of success is the same on every trial, and all trials are independent of each other (Weisstein). A Bernoulli trial, which is used in the binomial distribution, is a random experiment with only two possible outcomes, either a “success” or a “failure.” The following is the binomial distribution:

$$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}, \qquad k = 0, 1, \ldots, n$$

In this formula, k represents the number of successes in n trials of a Bernoulli process with probability of success p (Weisstein). The binomial distribution is a discrete probability distribution that is used to analyze the number of times that a specific event could occur in a given number of trials.

[Figure: graphical representation of the binomial distribution (Weisstein)]

Furthermore, the cumulative binomial distribution gives the probability that the number of successes in a binomial distribution falls within a specified range (Devore). It uses the same probability distribution as the binomial distribution, but sums the individual probabilities between a specific lower limit and upper limit.

[Figure: graphical representation of the cumulative distribution (Weisstein)]
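The binomial probabilities and a cumulative range probability can be computed directly from the standard binomial formula. The following is only a minimal sketch: the function names are illustrative, and the example values (n = 15, p = 0.5, and the range 5 to 10) are assumptions chosen for demonstration rather than values taken from the sources cited here.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_range_prob(lower: int, upper: int, n: int, p: float) -> float:
    """P(lower <= X <= upper): sum the pmf between the two limits."""
    return sum(binomial_pmf(k, n, p) for k in range(lower, upper + 1))


if __name__ == "__main__":
    n, p = 15, 0.5  # assumed example values
    print("P(X = 7)        =", binomial_pmf(7, n, p))
    print("P(5 <= X <= 10) =", binomial_range_prob(5, 10, n, p))
    print("P(0 <= X <= n)  =", binomial_range_prob(0, n, n, p))
```

Summing over the full range 0 to n should return a value equal to one up to rounding, which is a quick check that the probabilities form a valid distribution.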


A limiting case of the binomial distribution is the Poisson distribution. The Poisson distribution is another type of probability distribution; it gives the probability that a random event occurs a given number of times in a fixed amount of time. The Poisson distribution is a discrete one-parameter distribution whose parameter is both the mean and the variance of the distribution. The following is the Poisson distribution (Weisstein):

$$P(X = k) = \frac{e^{-\lambda t} (\lambda t)^k}{k!}, \qquad k = 0, 1, 2, \ldots$$

where t is the number of units of time, λ is the expected number of occurrences in one unit of time, and k is the number of times the event occurs.

[Figure: graphical representation of the Poisson distribution (Weisstein)]
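The statement that the Poisson distribution is a limiting case of the binomial distribution can be illustrated numerically. The sketch below assumes the standard Poisson probability with mean λt and compares it with a binomial distribution whose success probability is mean/n, so that both have the same mean. The function names, the mean of 4, and the values of n are illustrative assumptions.

```python
from math import comb, exp, factorial


def poisson_pmf(k: int, lam: float, t: float = 1.0) -> float:
    """P(X = k) for a Poisson variable with mean lam * t."""
    mean = lam * t
    return exp(-mean) * mean**k / factorial(k)


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def max_pmf_gap(n: int, mean: float, k_max: int = 40) -> float:
    """Largest difference between Binomial(n, mean/n) and Poisson(mean)
    probabilities over k = 0, ..., min(n, k_max)."""
    p = mean / n
    return max(
        abs(binomial_pmf(k, n, p) - poisson_pmf(k, mean))
        for k in range(min(n, k_max) + 1)
    )


if __name__ == "__main__":
    for n in (20, 200, 2000):  # assumed trial counts
        print(n, max_pmf_gap(n, mean=4.0))
```

The printed gap should shrink as n grows with the mean held fixed, which is the sense in which the binomial distribution tends to the Poisson distribution.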

The cumulative binomial distribution and the Poisson distribution can be approximated by the normal distribution due to the central limit theorem. The normal distribution is a continuous probability distribution that describes random variables clustered symmetrically around a single mean value. The normal distribution allows one to predict where a specific probability will fall based on such an approximation. The following is the normal distribution, where µ is the mean and σ² is the variance:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

With an understanding of the binomial distribution, the Poisson distribution, and the normal distribution, one can begin to prove the approximation. Both the binomial distribution and the Poisson distribution can be approximated by the normal distribution by using the central limit theorem. The central limit theorem states that, given a distribution with a mean and a finite variance, the sampling distribution of the mean approaches a normal distribution as the sample size increases (Cryer and Whitmer). The central limit theorem assumes Y1, Y2, …, Yn are independent random variables that are identically distributed with mean µ and finite variance σ². It then defines the random variable Un as follows:

$$U_n = \frac{\bar{Y} - \mu}{\sigma / \sqrt{n}}, \qquad \bar{Y} = \frac{1}{n} \sum_{i=1}^{n} Y_i$$

As n gets larger, the distribution function of Un converges to the standard normal distribution function, so the approximation becomes more accurate as n increases.
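The behavior of Un can be explored with a small simulation. The sketch below is an illustration under stated assumptions, not part of the proof: it takes the Yi to be uniform on (0, 1), so that µ = 1/2 and σ² = 1/12, uses n = 30 with 20,000 replicates, and fixes a seed so the run is repeatable. All of these choices, and the function names, are assumptions made for the example.

```python
import random
from math import erf, sqrt


def standard_normal_cdf(z: float) -> float:
    """Phi(z), the standard normal distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def simulate_u_n(n: int, replicates: int, seed: int = 0) -> list:
    """Draw `replicates` values of U_n = sqrt(n) * (Ybar - mu) / sigma
    for Y_i ~ Uniform(0, 1) (an assumed example distribution)."""
    rng = random.Random(seed)
    mu, sigma = 0.5, sqrt(1.0 / 12.0)
    values = []
    for _ in range(replicates):
        sample_mean = sum(rng.random() for _ in range(n)) / n
        values.append(sqrt(n) * (sample_mean - mu) / sigma)
    return values


def empirical_cdf(values: list, z: float) -> float:
    """Fraction of simulated values that are <= z."""
    return sum(v <= z for v in values) / len(values)


if __name__ == "__main__":
    u = simulate_u_n(n=30, replicates=20_000)
    for z in (-1.0, 0.0, 1.0):
        print(z, empirical_cdf(u, z), standard_normal_cdf(z))
```

The simulated proportions should sit close to the standard normal values, and repeating the run with a larger n should tighten the agreement.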

To complete the argument that these distributions converge to the normal distribution, one must prove the central limit theorem. The central limit theorem can be proved using many different methods, but the easiest method of proof uses moment generating functions. A moment generating function characterizes a probability distribution through the expected value of a function of the random variable, and it can be used to find all the moments of a random variable through simple operations (Bain and Engelhardt). The moment generating function of a random variable X is the function $M_X(t) = E(e^{tX})$. The function can then be rewritten, since the term $e^{tX}$ can be expanded around t = 0 using a Taylor series. Thus the moment generating function becomes:

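$$\begin{aligned}
M_X(t) = E\left(e^{tX}\right) &= E\left(e^{0} + t e^{0} X + \frac{1}{2} t^2 e^{0} X^2 + \frac{1}{6} t^3 e^{0} X^3 + \cdots\right) \\
&= 1 + t\,E(X) + \frac{t^2}{2} E(X^2) + \frac{t^3}{6} E(X^3) + \cdots
\end{aligned}$$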
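The Taylor form of the moment generating function can be checked on a simple case. The sketch below assumes a Bernoulli random variable with success probability p, for which every moment E(X^k) equals p and the exact moment generating function is 1 − p + p·e^t. The choice of distribution, the values p = 0.3 and t = 0.5, and the function names are assumptions made for illustration.

```python
from math import exp, factorial


def bernoulli_mgf(t: float, p: float) -> float:
    """Exact mgf of a Bernoulli(p) variable: E(e^{tX}) = (1 - p) + p * e^t."""
    return 1 - p + p * exp(t)


def mgf_from_moments(t: float, moments: list) -> float:
    """Truncated series 1 + t E(X) + t^2/2! E(X^2) + ...,
    where moments[k - 1] holds E(X^k)."""
    return 1.0 + sum(
        t**k / factorial(k) * m for k, m in enumerate(moments, start=1)
    )


if __name__ == "__main__":
    t, p = 0.5, 0.3  # assumed example values
    print("exact mgf    :", bernoulli_mgf(t, p))
    print("three moments:", mgf_from_moments(t, [p] * 3))
    print("ten moments  :", mgf_from_moments(t, [p] * 10))
```

Using only the three moments written out in the expansion already gives a close value for small t, and adding more moments brings the series closer to the exact function.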


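The moment generating function of the normal distribution will be used in the proof of the central limit theorem. To find it, first let X be a normal random variable with mean µ and standard deviation σ. Then (Weisstein),

$$\begin{aligned}
M_X(t) &= \int_{-\infty}^{\infty} e^{tx} \, \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx \\
&= \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp\left(tx - \frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right) dx
\end{aligned}$$

The exponential term can be broken down further by completing the square:

$$\begin{aligned}
tx - \frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2 &= -\frac{1}{2\sigma^2}\left[\left(x - \mu - \sigma^2 t\right)^2 - 2\mu\sigma^2 t - \sigma^4 t^2\right] \\
&= -\frac{1}{2}\left(\frac{x - \mu - \sigma^2 t}{\sigma}\right)^2 + \mu t + \frac{1}{2}\sigma^2 t^2
\end{aligned}$$

Therefore, the moment generating function for the normal distribution is given as:

$$M_X(t) = \exp\left(\mu t + \frac{1}{2}\sigma^2 t^2\right) \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp\left(-\frac{1}{2}\left(\frac{x - \mu - \sigma^2 t}{\sigma}\right)^2\right) dx$$

or, because the remaining integrand is a normal density with mean µ + σ²t and therefore integrates to one,

$$M_X(t) = \exp\left(\mu t + \frac{1}{2}\sigma^2 t^2\right)$$

Now that the background has been given, one can move to the proof of the central limit theorem. First define the random variable Zi by

$$Z_i = \frac{Y_i - \mu}{\sigma}$$

The mean of Zi is zero and the variance is one (Cryer and Whitmer). Next, using the Taylor expansion of the moment generating function with E(Zi) = 0 and E(Zi²) = 1, we can write the moment generating function for Zi as

$$M_{Z_i}(t) = 1 + \frac{t^2}{2} + \frac{t^3}{6} E(Z_i^3) + \cdots$$

From the stated central limit theorem, we know that

$$U_n = \frac{\sqrt{n}\left(\bar{Y} - \mu\right)}{\sigma} = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} Z_i$$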
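As a numerical sanity check on the closed form $M_X(t) = \exp(\mu t + \sigma^2 t^2/2)$ for the normal moment generating function, the defining integral of $e^{tx}$ against the normal density can be approximated directly. The sketch below uses a simple midpoint rule; the integration width, the number of steps, the example values µ = 1, σ = 2, t = 0.3, and the function names are all assumptions made for illustration.

```python
from math import exp, pi, sqrt


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of a normal variable with mean mu and standard deviation sigma."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))


def normal_mgf_numeric(t, mu, sigma, half_width=12.0, steps=20_000):
    """Midpoint-rule approximation of the integral of e^{tx} * f(x).
    The grid is centered at mu + sigma^2 * t, where the integrand peaks."""
    center = mu + sigma**2 * t
    start = center - half_width * sigma
    h = 2 * half_width * sigma / steps
    total = 0.0
    for i in range(steps):
        x = start + (i + 0.5) * h
        total += exp(t * x) * normal_pdf(x, mu, sigma)
    return total * h


def normal_mgf_closed_form(t, mu, sigma):
    """exp(mu * t + sigma^2 * t^2 / 2)."""
    return exp(mu * t + 0.5 * sigma**2 * t**2)


if __name__ == "__main__":
    mu, sigma, t = 1.0, 2.0, 0.3  # assumed example values
    print(normal_mgf_numeric(t, mu, sigma), normal_mgf_closed_form(t, mu, sigma))
```

The two printed values should agree to many decimal places, which supports the completed-square derivation without replacing it.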


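Since the random variables Yi are independent, we can conclude that the random variables Zi are also independent. By the properties of moment generating functions, the moment generating function of a sum of independent random variables is the product of the individual moment generating functions (Cryer and Whitmer). Therefore, writing $m_n(t)$ for the moment generating function of $U_n$, we have

$$m_n(t) = \left[M_{Z_i}\!\left(\frac{t}{\sqrt{n}}\right)\right]^n = \left(1 + \frac{t^2}{2n} + \frac{t^3}{6 n^{3/2}} E(Z_i^3) + \cdots\right)^n$$

In order to bring n down from the exponent, we can apply the natural log, such that

$$\ln m_n(t) = n \ln\left(1 + \frac{t^2}{2n} + \frac{t^3}{6 n^{3/2}} E(Z_i^3) + \cdots\right)$$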

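The expression on the right can be rewritten using the Taylor series expansion of the natural log, which is $\ln(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots$ for |x| < 1, with $x = \frac{t^2}{2n} + \frac{t^3}{6 n^{3/2}} E(Z_i^3) + \cdots$.

Therefore the log of our moment generating function becomes

$$\begin{aligned}
\ln m_n(t) &= n\left[\left(\frac{t^2}{2n} + \frac{t^3}{6 n^{3/2}} E(Z_i^3) + \cdots\right) - \frac{1}{2}\left(\frac{t^2}{2n} + \cdots\right)^2 + \cdots\right] \\
&= \frac{t^2}{2} + \frac{t^3}{6\sqrt{n}} E(Z_i^3) + \cdots
\end{aligned}$$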

Looking closely at our function, we can see that every term except the first has a positive power of n in the denominator (Cryer and Whitmer). Therefore, as n becomes large, or as n → ∞, all the terms but the first go to zero. To see what happens to the remaining term as n gets larger, we take the limit of $\ln m_n(t)$ as n → ∞:

$$\lim_{n \to \infty} \ln m_n(t) = \frac{t^2}{2}$$

We can then exponentiate in order to remove the natural log, so we have

$$\lim_{n \to \infty} m_n(t) = e^{t^2/2}$$

The function above should look familiar. Recall from the earlier derivation that the moment generating function for the normal distribution is $\exp\left(\mu t + \frac{1}{2}\sigma^2 t^2\right)$. With a mean of zero and a variance of one, this becomes $e^{t^2/2}$, which is exactly the limit of $m_n(t)$. Because a moment generating function uniquely determines its distribution, this means that the distribution of Un converges to the standard normal distribution as n goes to ∞, whatever the distribution of the Yi, provided its moment generating function exists (Cryer and Whitmer). Therefore, the approximation becomes more accurate as n becomes larger. Thus, the central limit theorem can be used to approximate the distributions of sums of random variables.
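The convergence of $m_n(t)$ to $e^{t^2/2}$ can also be observed numerically for a specific distribution. The sketch below assumes the Yi are Bernoulli with p = 0.3, standardizes them to Zi, and evaluates the n-th power of the moment generating function of Zi at t/√n. The choice of distribution, the value t = 1, the values of n, and the function names are assumptions made for the example.

```python
from math import exp, sqrt


def standardized_bernoulli_mgf(s: float, p: float) -> float:
    """mgf of Z = (Y - p) / sqrt(p(1 - p)) for Y ~ Bernoulli(p)."""
    sd = sqrt(p * (1 - p))
    z_failure = (0 - p) / sd   # value of Z when Y = 0
    z_success = (1 - p) / sd   # value of Z when Y = 1
    return (1 - p) * exp(s * z_failure) + p * exp(s * z_success)


def mgf_of_u_n(t: float, n: int, p: float) -> float:
    """m_n(t) = [M_Z(t / sqrt(n))]^n, using independence of the Z_i."""
    return standardized_bernoulli_mgf(t / sqrt(n), p) ** n


if __name__ == "__main__":
    t, p = 1.0, 0.3  # assumed example values
    limit = exp(t**2 / 2)
    for n in (10, 100, 10_000):
        print(n, mgf_of_u_n(t, n, p), "limit:", limit)
```

The printed values should approach the limit as n grows, at roughly the 1/√n rate suggested by the leading remainder term in the expansion.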

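The binomial and Poisson random variables can both be written as sums of independent, identically distributed random variables: a binomial random variable is the sum of n independent Bernoulli trials, and a Poisson random variable with mean λ is the sum of n independent Poisson random variables each with mean λ/n. We can therefore apply the central limit theorem in order for them to be approximated by the normal distribution. The best way to illustrate such an approximation is with graphs.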

The following graph represents a cumulative binomial distribution being approximated with the normal distribution (Boucher, Normal Approximation to a Binomial Random Variable). The cumulative binomial distribution is seen as the red “steps” with n = 15 and p = 0.5. The function seen running over the cumulative binomial distribution is the normal distribution. As you can see, the cumulative binomial distribution is extremely similar to the normal distribution, as the central limit theorem predicts.

The following graph uses the same binomial distribution as the graph above; however, n has gotten larger, with n = 55 and p the same (Boucher, Normal Approximation to a Binomial Random Variable). You can see that the approximation is becoming closer and more accurate because the distributions “lie on top” of one another.
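The improvement from n = 15 to n = 55 can also be measured rather than judged by eye. The sketch below computes the largest gap between the cumulative binomial distribution and the normal distribution function with mean np and variance np(1 − p), evaluated at the integers 0, …, n. The optional continuity correction of 0.5 is an addition that does not appear in the graphs described here, and the function names are illustrative.

```python
from math import comb, erf, sqrt


def normal_cdf(x: float, mean: float, sd: float) -> float:
    """Normal distribution function with the given mean and standard deviation."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))


def binomial_cdf_values(n: int, p: float) -> list:
    """Return [P(X <= 0), P(X <= 1), ..., P(X <= n)]."""
    total, values = 0.0, []
    for k in range(n + 1):
        total += comb(n, k) * p**k * (1 - p) ** (n - k)
        values.append(total)
    return values


def max_cdf_gap(n: int, p: float, correction: float = 0.0) -> float:
    """Largest |P(X <= k) - Phi((k + correction - np) / sd)| over k = 0..n."""
    mean, sd = n * p, sqrt(n * p * (1 - p))
    cdf = binomial_cdf_values(n, p)
    return max(
        abs(cdf[k] - normal_cdf(k + correction, mean, sd)) for k in range(n + 1)
    )


if __name__ == "__main__":
    for n in (15, 55):
        print(n, max_cdf_gap(n, 0.5), max_cdf_gap(n, 0.5, correction=0.5))
```

Both versions of the gap should be smaller for n = 55 than for n = 15, and the continuity-corrected gap should be the smaller of the two at each n.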

Like the cumulative binomial distribution, the Poisson distribution can be approximated by the normal distribution (Boucher, Normal Approximation to a Poisson Random Variable). The following graph represents a Poisson distribution with λ = 20, with the zoom representing n. Once again you can see that the Poisson distribution overlaps the normal distribution.

The following graph uses the same Poisson distribution, but extends the view so that n becomes larger (Boucher, Poisson Distribution). Notice that the approximation of the Poisson distribution by the normal distribution becomes more accurate as n increases.
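A similar numerical comparison can be made for the Poisson distribution. The sketch below compares the cumulative Poisson distribution with a normal distribution function having mean λ and variance λ. The value λ = 20 comes from the graph described above; the larger value λ = 80, the continuity correction of 0.5, the cutoff of ten standard deviations above the mean, and the function names are assumptions introduced only for this comparison.

```python
from math import erf, exp, sqrt


def normal_cdf(x: float, mean: float, sd: float) -> float:
    """Normal distribution function with the given mean and standard deviation."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))


def poisson_cdf_values(lam: float, k_max: int) -> list:
    """Return [P(X <= 0), ..., P(X <= k_max)] for X ~ Poisson(lam).
    The pmf is built recursively to avoid overflow in lam**k / k!."""
    pmf = exp(-lam)
    total, values = 0.0, []
    for k in range(k_max + 1):
        if k > 0:
            pmf *= lam / k
        total += pmf
        values.append(total)
    return values


def max_cdf_gap(lam: float, correction: float = 0.5) -> float:
    """Largest |P(X <= k) - Phi((k + correction - lam) / sqrt(lam))|."""
    sd = sqrt(lam)
    k_max = int(lam + 10 * sd)  # assumed cutoff far into the upper tail
    cdf = poisson_cdf_values(lam, k_max)
    return max(
        abs(cdf[k] - normal_cdf(k + correction, lam, sd))
        for k in range(k_max + 1)
    )


if __name__ == "__main__":
    for lam in (20.0, 80.0):
        print(lam, max_cdf_gap(lam), max_cdf_gap(lam, correction=0.0))
```

Under these assumptions the gap should be smaller for the larger mean, which is consistent with the normal approximation improving as the Poisson mean grows.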


Based on the proof of the central limit theorem, one can approximate the distribution of a sum of independent, identically distributed random variables by the normal distribution. The theorem also implies that as n increases, the approximation becomes more accurate. A cumulative binomial distribution or a Poisson distribution can likewise be approximated by the normal distribution, since each arises from a sum of independent random variables. This has been shown through the example graphs earlier. This is an important concept to understand, since it allows one to further analyze data that follow a binomial distribution or a Poisson distribution.


Works Cited

Bain, Lee and Max Engelhardt. Introduction to Probability and Mathematical Statistics. Boston: PWS Publishers, 1987.

Boucher, Chris. Binomial Distribution. 2007. 9 April 2011 <http://demonstrations.wolfram.com/BinomialDistribution/>.

—. Normal Approximation to a Binomial Random Variable. 2007. 9 April 2011 <http://demonstrations.wolfram.com/NormalApproximationToABinomialRandomVariable/>.

—. Normal Approximation to a Poisson Random Variable. 2007. 9 April 2011 <http://demonstrations.wolfram.com/NormalApproximationToAPoissonRandomVariable/>.

—. Poisson Distribution. 2007. 9 April 2011 <http://demonstrations.wolfram.com/PoissonDistribution/>.

Cryer, Jon and Jeff Whitmer. Introduction to the Central Limit Theorem. 26 June 1999. 15 April 2011 <http://courses.ncssm.edu/math/Stat_Inst/PDFS/SEC_4_f.pdf>.

Devore, Jay. Probability and Statistics for Engineering and the Sciences. Ed. Jennifer Burger. 4th Edition. Belmont: Wadsworth Publishing Company, 1995.


Falmagne, Jean-Claude. Lectures in Elementary Probability Theory and Stochastic Processes. New York: William Barter, 2003.

Lane, David. Central Limit Theorem. 2008. 14 April 2011 <http://davidmlane.com/hyperstat/A14043.html>.

Weisstein, Eric. Binomial Distribution. 1999. 16 April 2011 <http://mathworld.wolfram.com/BinomialDistribution.html>.

Weisstein, Eric W. Central Limit Theorem. MathWorld--A Wolfram Web Resource. <http://mathworld.wolfram.com/CentralLimitTheorem.html>.

Weisstein, Eric W. Moment-Generating Function. MathWorld--A Wolfram Web Resource. <http://mathworld.wolfram.com/Moment-GeneratingFunction.html>.

Weisstein, Eric W. Normal Distribution. MathWorld--A Wolfram Web Resource. <http://mathworld.wolfram.com/NormalDistribution.html>.

Weisstein, Eric W. Poisson Distribution. MathWorld--A Wolfram Web Resource. <http://mathworld.wolfram.com/PoissonDistribution.html>.