
Module II - The University of Texas at Dallaswiorkow/documents/ModIILect4.doc · Web viewModule II Lecture 4 Special Probability Distributions Certain probability distributions occur


Module II Lecture 4

Special Probability Distributions

Certain probability distributions occur with such regularity in real-life situations that they have been given their own names and it is worth studying their properties.

In this section, we look at three probability distributions that arise in almost every aspect of business.


The Binomial Distribution

Consider the following situations:

a) You audit a transaction, it is either in compliance with procedures or it is not;

b) You hire a person, that person is either a female or a male;

c) You visit a customer, it either leads to a sale or it doesn’t;

d) You lower the price of a product, sales either increase or they don’t;

e) You have a model for the stock market, you predict that it will go up at least 30 points, it either goes up 30 or more points or it doesn’t;

f) You have an intermittent problem on your company’s network, on any given day the problem either appears or doesn’t appear.

All of these situations are ones where the Binomial Distribution may be applicable.


There is a canonical definition for the binomial distribution, that is, a set of assumptions which, if they hold, indicate that the binomial distribution may be applied to a particular situation.

Let us suppose that in a given situation only one of two possible things can occur. For example, if we flip a fair coin then we can only get the outcomes heads or tails. Flipping the coin is called an experiment in statistical jargon, and heads or tails are called the possible outcomes of the experiment. Each repetition of the experiment is called a trial.


The binomial distribution applies in the following situation:

a) The outcome of any trial can only take on two possible values, say success and failure;

b) There is a constant probability p of success on each trial;

c) The experiment is repeated n times (i.e. n trials are conducted);

d) The trials are statistically independent (i.e. the outcome of past trials does not affect subsequent trials);

then if x equals the number of successes in the n trials, we have:

$$P(x) = \binom{n}{x} p^x (1-p)^{n-x} = \frac{n!}{x!\,(n-x)!}\, p^x (1-p)^{n-x}$$

for x = 0, 1, 2, ..., n.


For example, if we flipped a fair coin ten times, and let x equal the number of heads, the above formula would give the following probabilities:

x      P(x)
0      0.000977
1      0.009766
2      0.043945
3      0.117188
4      0.205078
5      0.246094
6      0.205078
7      0.117188
8      0.043945
9      0.009766
10     0.000977

Sum    1
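The table above can be reproduced outside of EXCEL. The following is a minimal Python sketch of the binomial formula; the function name `binomial_pmf` is an illustrative choice and not part of the lecture.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Fair coin flipped ten times: n = 10, p = .5
n, p = 10, 0.5
table = [binomial_pmf(x, n, p) for x in range(n + 1)]

print(" x  P(x)")
for x, prob in enumerate(table):
    print(f"{x:2d}  {prob:.6f}")
print("Sum", round(sum(table), 6))
```

Changing `p` to .3 or .05 gives the probabilities behind the biased-coin cases.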


Graphically, the probability distribution looks like:

[Figure: Binomial Distribution, n = 10, p = .50. Horizontal axis: x (0 to 10); vertical axis: Probability.]


If we used a biased coin so that the probability of getting a head is only .3, then the probability distribution would look like:

[Figure: Binomial Distribution, n = 10, p = .30. Horizontal axis: x (0 to 10); vertical axis: Probability.]


If the coin were extremely biased so that the probability of heads was only .05, then the distribution would look like:

[Figure: Binomial Distribution, n = 10, p = .05. Horizontal axis: x (0 to 10); vertical axis: Probability.]


EXCEL allows one to compute the binomial probability distribution directly. The form of the function is:

=binomdist(x, n, p, condition),

where x is the value of interest, n is the number of trials, p is the probability of success and condition is either “false” or “true”.

If you specify the following command,

=binomdist(3, 10, .50, false),

then EXCEL will compute the probability that x = 3.

If you use the command,

=binomdist(3, 10, .50, true),

then EXCEL will compute the probability that x <= 3. In other words, EXCEL will accumulate the probabilities for x = 0, x = 1, x = 2, and x = 3 and report the total.


The following table shows the use of both conditions in the case where n = 10, and p = .5:

x      P(x)        P(<=x)
0      0.000977    0.000977
1      0.009766    0.010742
2      0.043945    0.054688
3      0.117188    0.171875
4      0.205078    0.376953
5      0.246094    0.623047
6      0.205078    0.828125
7      0.117188    0.945313
8      0.043945    0.989258
9      0.009766    0.999023
10     0.000977    1.000000

Sum    1
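For readers working outside EXCEL, the two conditions can be mimicked with a short function. This is only a sketch: the Python function below borrows the name `binomdist` for convenience and is a stand-in, not EXCEL's implementation.

```python
from math import comb

def binomdist(x, n, p, cumulative):
    """P(X = x) when cumulative is False, P(X <= x) when it is True."""
    def pmf(k):
        return comb(n, k) * p**k * (1 - p)**(n - k)
    if cumulative:
        return sum(pmf(k) for k in range(x + 1))
    return pmf(x)

exact = binomdist(3, 10, 0.5, False)   # probability that x = 3
at_most = binomdist(3, 10, 0.5, True)  # probability that x <= 3
print(f"{exact:.6f}  {at_most:.6f}")
```

The cumulative value is simply the running total of the P(x) column, which is how the P(<=x) column of the table is built.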


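One can show that,

E(x) = np,

and,

$$\mathrm{Var}(x) = np(1-p).$$

If instead of x, the number of successes, we are interested in

$$\hat{p} = x / n,$$

that is, the proportion of successes in n trials, then one can show that

$$E(\hat{p}) = p,$$

and,

$$\mathrm{Var}(\hat{p}) = \frac{p(1-p)}{n}.$$

In our case, E(x) = 10 * .5 = 5,

and,

$$\mathrm{Var}(x) = 10 \times .5 \times .5 = 2.5.$$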
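The mean formula E(x) = np, together with the standard binomial variance formula np(1 - p), can be checked numerically by summing over the probability table. The sketch below does this for the fair-coin case; the helper name `binomial_pmf` is an assumption made for illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.5
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

# Mean and variance computed directly from the definition
mean = sum(x * pr for x, pr in enumerate(probs))
variance = sum((x - mean) ** 2 * pr for x, pr in enumerate(probs))

print(mean, n * p)                # direct sum versus np
print(variance, n * p * (1 - p))  # direct sum versus np(1 - p)
```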


Let us apply the binomial distribution to a more practical problem than flipping coins.

Suppose that you are going to hire 10 persons from a pool of qualified candidates which is 30% women. You find that only 1 woman and 9 men were hired. Is this evidence that the firm is discriminating against women in hiring?

The first question is to determine if the binomial distribution is applicable.

Clearly each hire can only be a man or woman so there are only two possible outcomes.

The hires are probably independent of one another.

The major problem is whether or not the probability of success, p, is constant from trial to trial. If there were only a total of 20 applicants, 6 women and 14 men, then if you hired one of the women on the first hire, that would leave 5 women and 14 men which would mean the probability of hiring a woman for the second hire could only be

5 / 19 = .2632,

which is a very large change from the initial probability of .30. On the other hand, if the hiring pool consisted of 100 applicants, 30 women and 70 men, then if you hired one of the women on the first hire, that would leave 29 women and 70 men, which would mean the probability of hiring a woman for the second hire would be

29 / 99 = .2929

which is a very small change.

The actual probability distribution to use is called the hypergeometric distribution. However, it is well known that if the probability p does not change much from trial to trial, then the results from the hypergeometric distribution and the binomial distribution are almost identical.

Assuming that the probability of hiring a woman does not change much over the 10 hires, then we can reasonably assume that the probability is approximately constant over the 10 hires and the assumptions of the binomial distribution are approximately fulfilled.
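The effect of the pool size can be illustrated numerically. The sketch below compares the binomial probability of hiring 1 or fewer women in 10 hires with the hypergeometric probability for the two pools discussed above (20 applicants with 6 women, and 100 applicants with 30 women). The function names are illustrative assumptions.

```python
from math import comb

def binomial_cdf(x, n, p):
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def hypergeom_cdf(x, pool, women, hires):
    """P(at most x women) when `hires` people are drawn without replacement."""
    total = comb(pool, hires)
    ways = sum(comb(women, k) * comb(pool - women, hires - k)
               for k in range(x + 1))
    return ways / total

binom = binomial_cdf(1, 10, 0.3)
small_pool = hypergeom_cdf(1, pool=20, women=6, hires=10)
large_pool = hypergeom_cdf(1, pool=100, women=30, hires=10)

print(f"binomial            {binom:.6f}")
print(f"hypergeometric, 20  {small_pool:.6f}")
print(f"hypergeometric, 100 {large_pool:.6f}")
```

The larger pool gives a value much closer to the binomial one, which is the sense in which the binomial assumptions are approximately fulfilled.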


The next problem we face is that any value between 0 and 10 can possibly occur. Indeed the probability distribution for this situation, i.e. binomial with n = 10, and p = .3, is given in the table below:

Female                 Cumulative
Hires      Prob        Prob
0          0.028248    0.028248
1          0.121061    0.149308
2          0.233474    0.382783
3          0.266828    0.649611
4          0.200121    0.849732
5          0.102919    0.952651
6          0.036757    0.989408
7          0.009002    0.998410
8          0.001447    0.999856
9          0.000138    0.999994
10         0.000006    1.000000

Sum        1.000000

Notice that although any value is possible, the values are not all equally probable. For example, it would not seem at all odd if we hired 3 women or 2 women or 4 women, since these values all have reasonably high probabilities. On the other hand, it would seem odd if we hired 10 women, since the probability of this outcome is approximately 6 chances in 1,000,000. Almost as rare as winning the Texas Lottery!


Statistical logic works like this:

a) define what you think is a rare event (most users of statistics define rare as 1 chance in 20 [.05] or 1 chance in 100 [.01]);

b) if the probability of the observed result or anything more extreme is less than what you define as rare, then the assumed value of p is suspect.

In our case we observed 1 female hire. More extreme is to hire 0 women. Therefore we want the probability of observing 1 or fewer women. This can be obtained directly from the above table or by using the =binomdist(1, 10, .3, true) command.

The result is a probability of .149308. This is roughly a chance of 1 in 7, which most people would not think is rare. Accordingly, this data would not suggest that women are being hired disproportionately. Of course, if it happened more than once, it might be indicative. Suppose a month later the same thing happens. The probability of hiring 1 or fewer women in 10 hires from a pool that is 30% women on two independent occasions would be:

(.149308) * (.149308) = .0223

which would lead many people to doubt that the hiring environment is fair.
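The rare-event logic can be written out as a short calculation. This is a sketch only; the threshold name `RARE` and the helper `binomial_cdf` are illustrative assumptions, and .05 is one of the two conventional cutoffs mentioned above.

```python
from math import comb

def binomial_cdf(x, n, p):
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

RARE = 0.05  # one chance in 20

p_once = binomial_cdf(1, 10, 0.3)  # 1 or fewer women in 10 hires
p_twice = p_once ** 2              # same result in two independent rounds

for label, prob in [("once", p_once), ("twice", p_twice)]:
    verdict = "rare" if prob < RARE else "not rare"
    print(f"{label}: {prob:.4f} -> {verdict}")
```

Multiplying the two probabilities relies on the two rounds of hiring being independent of one another.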


Graphically, the probability distribution we would expect for x = the number of women hired when you are hiring for 10 positions (n) from a pool of qualified applicants which is 30% female ( p = .30) is given below with the observed and more extreme value highlighted:

[Figure: Number of Women Hires. Horizontal axis: x (0 to 10); vertical axis: Probability.]


Now let us leave the percentage of women in the qualified pool the same as previously (i.e. p =.3) but now hire for 50 positions (n=50). And assume again we only hire 10% women (x=5). Then the probability distribution would look like:

As can be seen, hiring only 5 women in 50 hires from a pool of 30 % women, is a relatively rare event. Using the binomdist function I can compute:

P(x<= 5) = binomdist(5, 50, .3, true) = .00072.

This amounts to a chance of approximately 7 in 10,000, which is highly improbable. Accordingly, in this situation we would doubt that women are being hired in proportion to their representation in the applicant pool.

[Figure: Number of Women Hires. Horizontal axis: x (0 to 48); vertical axis: Probability; the observed value x = 5 is marked.]


It is very easy to simulate the binomial distribution. Suppose we wish to simulate values of x from a binomial distribution with n = 10 and p = .4.

For each trial, we could use the statement:

=if(rand()<=.4, 1, 0)

This would generate a value of 1 approximately 40% of the time. If we repeated the above statement 10 times and added up the results, this would be equivalent to taking a sample of n = 10 and observing x where x followed the binomial distribution with p = .4.
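The trial-by-trial approach translates directly into code. The sketch below uses Python's standard `random` module in place of EXCEL's rand(); the function name `simulate_binomial` and the seed value are arbitrary choices made for illustration.

```python
import random

def simulate_binomial(n, p, rng):
    """Add up n independent 0/1 trials, each equal to 1 with probability p."""
    return sum(1 if rng.random() <= p else 0 for _ in range(n))

rng = random.Random(2024)  # fixed seed only so the run is repeatable
draws = [simulate_binomial(10, 0.4, rng) for _ in range(10_000)]

average = sum(draws) / len(draws)
print(average)  # should land near n * p = 4
```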

The above procedure, however, is cumbersome. Notice that what we really did was to divide the range of values between 0 and 1 into two regions. The first went from 0 to .4. If the random number fell in this range, we said that the outcome should be 1. If the random number fell in the range .4 to 1 we said the outcome should be 0.

If we had a random variable with three possible outcomes, then we could divide the interval between 0 and 1 into three regions with the size of each region proportional to its probability.


In our case, with n = 10, we have eleven possible values which can occur, namely the values 0, 1, 2, . . . , 10. We wish to divide the interval between 0 and 1 into eleven regions, each with a size proportional to the probability of the corresponding value, given as:

Binomial Distribution, n = 10, p = .4

x      P(x)
0      0.0060466
1      0.0403108
2      0.1209324
3      0.2149908
4      0.2508227
5      0.2006581
6      0.1114767
7      0.0424673
8      0.0106168
9      0.0015729
10     0.0001049

Sum    1


This can easily be accomplished by computing the cumulative binomial probability distribution which is shown below:

Cumulative Binomial Distribution, n = 10, p = .4

x      P(<=x)
0      0.0060466
1      0.0463574
2      0.1672898
3      0.3822806
4      0.6331033
5      0.8337614
6      0.9452381
7      0.9877054
8      0.9983223
9      0.9998951
10     1

We now set up the following rule:

If Random Number is between      Then x =
0 and 0.006047                   0
0.006047 and 0.046357            1
0.046357 and 0.16729             2
0.16729 and 0.382281             3
0.382281 and 0.633103            4
0.633103 and 0.833761            5
0.833761 and 0.945238            6
0.945238 and 0.987705            7
0.987705 and 0.998322            8
0.998322 and 0.999895            9
0.999895 and 1                   10

For example if we generated the random number .188754, this value falls between .16729 and .382281 so we would say that x = 3 for those 10 trials. If we generated a second random number, say .99461, then this value falls between .987705 and .998322 so that we would say that x = 8 for those 10 trials.
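The rule above amounts to finding the region whose lower edge is the largest one not exceeding the random number. A minimal Python sketch of that search, using the standard `bisect` module, is shown here; the names `lower_edges` and `lookup_x` are hypothetical.

```python
from bisect import bisect_right
from math import comb

n, p = 10, 0.4
pmf = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

# Lower edge of each region: 0, P(<=0), P(<=1), ..., P(<=9)
lower_edges = [0.0]
for prob in pmf[:-1]:
    lower_edges.append(lower_edges[-1] + prob)

def lookup_x(u):
    """Return the x whose region contains the random number u (0 <= u < 1)."""
    return bisect_right(lower_edges, u) - 1

print(lookup_x(0.188754), lookup_x(0.99461))
```

Starting the list of edges at 0 plays the same role as adding a zero ahead of the first cumulative probability.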


The entire process can be automated further in EXCEL using the function “LOOKUP”.

This function has three arguments:

The first is the value we wish to look up, in this case this is the random number.

The second argument is the table in which you want to look up the probability; in our case this is the cumulative probability distribution column (but we must add the value “0” in the row preceding the first probability).

Finally, the third argument is the table containing the results, in our case the values of x.


The entire process is shown in the following screen shot:


The first argument of the function “LOOKUP” is the random number in column F row 175 (shown in blue). The second argument is the table of the cumulative binomial distribution (with zero added) shown in green. The third argument is the actual value of the lookup process, the value of x shown in lavender.

Notice that the ranges of the second and third arguments have the symbol “$” prefixing both the column and row entry. This is necessary so that if the formula is copied (as when applying it to a table of random numbers), the look up table ranges remain fixed, since EXCEL uses relative addressing by default.

The following steps a), b), and c) show the entire process of simulating, 25 times, the number of successes when 10 trials are run.

Simulate Binomial with n = 10 and p = .4

a) Generate Distribution: binomdist(x, 10, .4, true), with 0 added in the row preceding the first probability

x      Cumulative
       0
0      0.006047
1      0.046357
2      0.16729
3      0.382281
4      0.633103
5      0.833761
6      0.945238
7      0.987705
8      0.998322
9      0.999895
10     1

b) Generate Random Numbers

0.188754    0.99461     0.280296    0.997045    0.562853
0.105325    0.608283    0.161509    0.394184    0.490515
0.555951    0.207534    0.404118    0.052352    0.363331
0.480493    0.129214    0.220414    0.892615    0.583843
0.131033    0.085797    0.821619    0.335211    0.316501

c) Use LOOKUP(x, ProbCOL, ResultCOL) after anchoring results

3    8    3    8    4
2    4    2    4    4
4    3    4    2    3
4    2    3    6    4
2    2    5    3    3


The procedure described above will work for any discrete probability distribution. In 2001, I discovered an esoteric function in EXCEL which will in fact do the above procedure in one step, but only for the Binomial Distribution. The function is named “Critbinom” and has the following syntax:

=critbinom(n, p, random number)

The table below shows the application of the above function to the same random numbers used in the previous example. You will note that the results are identical to those previously obtained using the lookup table.

a) Generate Random Numbers

0.188754    0.99461     0.280296    0.997045    0.562853
0.105325    0.608283    0.161509    0.394184    0.490515
0.555951    0.207534    0.404118    0.052352    0.363331
0.480493    0.129214    0.220414    0.892615    0.583843
0.131033    0.085797    0.821619    0.335211    0.316501

b) Use CRITBINOM(10, .4, random number)

3    8    3    8    4
2    4    2    4    4
4    3    4    2    3
4    2    3    6    4
2    2    5    3    3

This function replaces a rather complicated three stage procedure with a simple two step procedure.
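The one-step approach can be sketched in Python by returning the smallest x whose cumulative binomial probability reaches the random number. The function below reuses the name `critbinom` for readability; it is a stand-in written for this illustration and makes no claim about how EXCEL computes its result.

```python
from math import comb

def critbinom(n, p, alpha):
    """Smallest x whose cumulative binomial probability is >= alpha."""
    cumulative = 0.0
    for x in range(n + 1):
        cumulative += comb(n, x) * p**x * (1 - p)**(n - x)
        if cumulative >= alpha:
            return x
    return n  # guards against floating-point shortfall when alpha is near 1

random_numbers = [
    [0.188754, 0.99461, 0.280296, 0.997045, 0.562853],
    [0.105325, 0.608283, 0.161509, 0.394184, 0.490515],
    [0.555951, 0.207534, 0.404118, 0.052352, 0.363331],
    [0.480493, 0.129214, 0.220414, 0.892615, 0.583843],
    [0.131033, 0.085797, 0.821619, 0.335211, 0.316501],
]

simulated = [[critbinom(10, 0.4, u) for u in row] for row in random_numbers]
for row in simulated:
    print(row)
```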


The Poisson Distribution

The Poisson Distribution is another distribution which arises in a great number of business situations. It usually is applicable in situations where random phenomena occur at a certain rate over a period of time. For example, it describes the number of people in line at a checkout counter as well as the number of telephone calls received at a switching point. It, like the Binomial Distribution, has a canonical definition.

Assume you have an “exposure” variable such as time (it does not have to be time, but it has to be continuous). Assume that this time period can be divided into small enough increments, say of width dt, so that in any one of these intervals something happens or doesn’t happen. For example, consider phone calls during an hour period. Obviously we can divide the hour into small intervals, maybe of width 1 second, so that we can either receive one call or no call.

Assume the probability of an event occurring in an interval of width dt is λ dt, where λ is the rate at which events occur. Assume further that the occurrence or non-occurrence of an event in one interval is independent of the occurrence or non-occurrence of the same event in another interval.

Then if one defines,

μ = λ t

where t is the length of the interval, then the probability of x occurrences in the interval of length t is given by,

$$P(x) = \frac{e^{-\mu} \mu^{x}}{x!}$$

for x = 0, 1, 2, . . . .


One can show that for the Poisson Distribution with parameter μ,

$$E(x) = \mu,$$

and,

$$\mathrm{Var}(x) = \mu.$$


One can usually easily recognize situations where the Poisson Distribution is applicable since they usually involve a rate and an exposure.

For example, suppose an office receives, on average, 15 calls per hour. In a two hour period the office received 45 calls. Is this a rare event? Here the rate is 15 calls per hour and the exposure is a two hour period. This implies that

μ = 15 * 2 = 30.

As another example, suppose a manufacturing plant has, on average, 24 accidents per year. In a one month period it has 5 accidents. Is this a rare event? Here the rate is 24 accidents per year and the exposure is one month ( 1/12 of a year). This implies that

μ = 24 * (1/12) = 2.

The exposure does not have to be time. For example, consider the situation where, on average, a driver has 1 accident per 50,000 miles driven. Suppose the person drives a vehicle for 100,000 miles and has no accidents. Is this a rare event? Here the rate is 1 accident per 50,000 miles driven and the exposure is 100,000 miles so that

μ = 100,000 * (1 / 50,000) = 2.


The probability distribution for this last case is given below:

x      P(x)        P(<=x)
0      0.135335    0.135335
1      0.270671    0.406006
2      0.270671    0.676676
3      0.180447    0.857123
4      0.090224    0.947347
5      0.036089    0.983436
6      0.012030    0.995466
7      0.003437    0.998903
8      0.000859    0.999763
9      0.000191    0.999954
10     0.000038    0.999992
11     0.000007    0.999999
12     0.000001    1.000000
13     0.000000    1.000000

Note that the probabilities continue on past 12; they are just so small that they appear in the table as zero.
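The table for the driving example can be regenerated from the rate and the exposure. The sketch below assumes the standard Poisson formula, e^(-mu) * mu^x / x!; the function names are illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    return exp(-mu) * mu**x / factorial(x)

def poisson_cdf(x, mu):
    return sum(poisson_pmf(k, mu) for k in range(x + 1))

# Rate: 1 accident per 50,000 miles; exposure: 100,000 miles
mu = 100_000 / 50_000

print(" x  P(x)      P(<=x)")
for x in range(14):
    print(f"{x:2d}  {poisson_pmf(x, mu):.6f}  {poisson_cdf(x, mu):.6f}")

# Chance of no accidents over the whole exposure
no_accidents = poisson_pmf(0, mu)
print(f"P(x = 0) = {no_accidents:.6f}")
```

The probability of no accidents is the first entry of the table, about .135, which is not rare by the usual standards of .05 or .01.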

Graphically, the distribution is as shown below:

[Figure: Poisson Distribution, mu = 2. Horizontal axis: x (0 to 13); vertical axis: Probability.]


Now consider the situation where we are inspecting parts for defects. Assume we have a defective rate of 1/1000. If we inspect 1000 parts and observe 3 defects, should we worry? In this case μ = 1000 * (1/1000) = 1.

As in the binomial case, EXCEL can be used to compute the probability of this or any more extreme event. The function to use has the form,

=Poisson(x, mu, condition),

where, just as in the case of the binomial distribution, a condition of ‘false’ gives us the probability of x, and a condition of ‘true’ gives us the probability of being less than or equal to x.

In this case we want the probability of 3 or anything more extreme, that is, the probability of three or more. We can find,

P(x <= 2) = poisson(2, 1, true) = .919699,

so that,

P(x >= 3) = 1 - .919699 = .080301,

which is not rare by the usual standards of .05 or .01.


A picture of the probabilities to be added is shown below:

[Figure: Defective Rate .001, n = 1000. Horizontal axis: Defects (0 to 15); vertical axis: Probability.]


Now assume we scale up the situation and inspect 5,000 parts. Then the parameter would change to

μ = 5,000 * (1 / 1000) = 5.

If we scale up the number of defectives to mirror that in the first problem, this would correspond to observing 5 * 3 = 15 defectives.

Graphically the Poisson Distribution with μ = 5 would look like:

[Figure: Poisson distribution plotted for x = 0 to 24, with the value x = 15 marked.]


By using the Poisson function in EXCEL, one obtains:

P(x >= 15) = 1 - poisson(14, 5, true) = .000226,

which is clearly a relatively rare event, and one which might encourage us to improve quality control.


Actually, I have been misleading you a bit. For in fact if we inspect 5,000 parts each of which has a probability of .001 ( 1 / 1000) of being defective, I am really describing a Binomial situation.

However if n is large and p is small, so that n * p is moderate, then the Poisson distribution can be used to approximate the binomial distribution by taking

μ = n * p.


The graph below shows how good this approximation is in this case with n = 5,000, p = .001 and μ = 5.

The Poisson distribution gave the probability of 15 or more defectives as .000226, while the exact value from the Binomial distribution is .000224, a small error in the sixth decimal place!

[Figure: Poisson Approximation to Binomial. Horizontal axis: x (0 to 24); vertical axis: Prob; series: poisson and binomial.]
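The quality of the approximation can also be checked by computing both tail probabilities directly. The sketch below does so with exact integer binomial coefficients; the function names are illustrative assumptions.

```python
from math import comb, exp, factorial

n, p = 5000, 0.001
mu = n * p  # Poisson parameter used for the approximation

def poisson_cdf(x, mu):
    return sum(exp(-mu) * mu**k / factorial(k) for k in range(x + 1))

def binomial_cdf(x, n, p):
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

# Probability of 15 or more defectives under each model
poisson_tail = 1 - poisson_cdf(14, mu)
binomial_tail = 1 - binomial_cdf(14, n, p)

print(f"Poisson  P(x >= 15) = {poisson_tail:.6f}")
print(f"Binomial P(x >= 15) = {binomial_tail:.6f}")
```

The two values should agree to about five decimal places, in line with the figures quoted above.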


Simulating a Poisson distribution is essentially the same as the long procedure for the Binomial Distribution (unfortunately no “one step” function exists for the Poisson distribution that can be used like “critbinom” for the binomial).

Below is a screen shot of the procedure:


A step by step illustration of simulating 25 realizations of the Poisson Distribution with parameter 5 is shown below:

Simulate a Poisson with mu = 5

a) Generate Distribution: poisson(x, 5, true), with 0 added in the row preceding the first probability

x      Cumulative
       0
0      0.006738
1      0.040428
2      0.124652
3      0.265026
4      0.440493
5      0.615961
6      0.762183
7      0.866628
8      0.931906
9      0.968172
10     0.986305
11     0.994547
12     0.997981
13     0.999302

b) Generate Random Numbers

0.188754    0.99461     0.280296    0.997045    0.562853
0.105325    0.608283    0.161509    0.394184    0.490515
0.555951    0.207534    0.404118    0.052352    0.363331
0.480493    0.129214    0.220414    0.892615    0.583843
0.131033    0.085797    0.821619    0.335211    0.316501

c) Use LOOKUP(x, ProbCOL, ResultCOL) after anchoring results

3    12   4    12   5
2    5    3    4    5
5    3    4    2    4
5    3    3    8    5
3    2    7    4    4
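Outside EXCEL the lookup table is not strictly needed, because the cumulative probabilities can be accumulated on the fly until they reach the random number. The following is a sketch of that idea; the function name `poisson_from_uniform` is hypothetical.

```python
from math import exp

def poisson_from_uniform(u, mu):
    """Smallest x whose cumulative Poisson probability reaches u."""
    x = 0
    term = exp(-mu)  # P(x = 0)
    cumulative = term
    # Stop once the running total reaches u (or the terms underflow to zero)
    while cumulative < u and term > 0.0:
        x += 1
        term *= mu / x  # P(x) = P(x - 1) * mu / x
        cumulative += term
    return x

random_numbers = [
    [0.188754, 0.99461, 0.280296, 0.997045, 0.562853],
    [0.105325, 0.608283, 0.161509, 0.394184, 0.490515],
    [0.555951, 0.207534, 0.404118, 0.052352, 0.363331],
    [0.480493, 0.129214, 0.220414, 0.892615, 0.583843],
    [0.131033, 0.085797, 0.821619, 0.335211, 0.316501],
]

simulated = [[poisson_from_uniform(u, 5) for u in row] for row in random_numbers]
for row in simulated:
    print(row)
```

Because the Poisson distribution has no upper limit on x, the loop continues for as long as needed rather than stopping at a fixed table length.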


The Normal Distribution

The normal distribution (the so-called "bell curve") is perhaps the best known probability distribution since it arises in so many situations.

In business applications it is commonly used to describe the distribution of the rate of return on investments. However, much business data is right skewed. If x is a typical business statistic, for example the assets of banks or the gross sales of companies, one usually finds that many small firms have modest values while a few firms have very large values. This gives rise to a right skewed distribution. However, if one looks at log(assets of banks) or log(gross sales), one finds that the logarithmic value is approximated closely by the normal distribution. Such variables are said to have the Log-Normal distribution.


Unlike the Binomial and Poisson distributions, the normal distribution is defined, theoretically, for continuous variables, that is variables with no gaps between potential values. Of course, in the real world, one never measures things to very many decimal places so that we can think of the normal distribution applying to many everyday variables. For example, if a man says he is six feet tall, he probably does not mean that he is exactly six feet tall. What is usually meant is that to the nearest inch, the man is six feet tall. That is his actual height is probably between five foot eleven and one half inches, and six feet and one half inch. Formally this is called "discretizing" the normal distribution.

Since the normal is a continuous curve, it does not have a probability distribution. Instead it has what is called a probability density function. A probability density function is a non-negative function f(x), which has the property that:

$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$

and

$$P(a \le x \le b) = \int_{a}^{b} f(x)\,dx.$$


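The form of the function f(x) for a normal distribution is:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

The normal distribution depends on two parameters μ and σ. We have used these symbols before to describe the mean and standard deviation of a population. For the normal distribution:

$$E(x) = \mu$$

and,

$$\mathrm{Var}(x) = \sigma^2.$$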


Below is a picture of two normal distributions which have the same standard deviations but different means:

[Figure: Normal Distributions. Horizontal axis: x (-6 to 5.7); vertical axis: f(x).]

Increasing the mean moves the normal curve to the right, while decreasing the mean moves the curve to the left.


Below are shown normal distributions with the same mean but with different standard deviations:

[Figure: Normal Distributions. Horizontal axis: x (-7 to 6.2); vertical axis: f(x).]

Increasing the standard deviation makes the curve more spread out and lower (this has to occur since the total area under the curve is always 1). Decreasing the standard deviation makes the curve less spread out and higher.


It is very easy to work with the normal distribution using EXCEL. Like the case of the Binomial and Poisson distributions, EXCEL provides a function for computing values for any normal distribution. The form of the function is :

=normdist(x, mean, sd, condition)

As in the case of the Binomial and Poisson distributions, setting condition = "true" gives the probability of being less than or equal to a particular value. If the condition is set to "false", one gets the value of f(x), which, unlike the case of the Binomial and Poisson distributions, is not a probability.

EXCEL also has the function:

=normsdist(z)

which gives the probability of being less than or equal to the value z for the special normal distribution with a mean of 0 and a standard deviation of 1 (called the Standard Normal Distribution). The Standard Normal Distribution, in the past, was quite important since it was used to compute probabilities for any normal distribution by just using a table of the Standard Normal Distribution and the transformations:

$$z = \frac{x - \mu}{\sigma},$$

and the inverse relationship,

$$x = \mu + z\sigma$$

With the advent of worksheet programs, one no longer needs these tables since the computer can generate the probabilities of interest directly.
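For readers working outside EXCEL, the same calculations can be sketched in Python. This is only an illustration: the helper names normdist and normsdist are my own stand-ins that mimic the EXCEL functions using the standard library class statistics.NormalDist, and the values x = 10, mean = 8 and sd = 4 are made up for the example.

```python
from statistics import NormalDist

def normdist(x, mean, sd, cumulative):
    """Stand-in for EXCEL's =normdist(x, mean, sd, condition)."""
    dist = NormalDist(mu=mean, sigma=sd)
    # cumulative=True  gives P(X <= x), a probability.
    # cumulative=False gives f(x), the height of the curve (not a probability).
    return dist.cdf(x) if cumulative else dist.pdf(x)

def normsdist(z):
    """Stand-in for EXCEL's =normsdist(z): the Standard Normal Distribution."""
    return NormalDist().cdf(z)

# Illustrative values only (not from the lecture).
x, mean, sd = 10.0, 8.0, 4.0

z = (x - mean) / sd                    # the transformation z = (x - mean) / sd
direct = normdist(x, mean, sd, True)   # what the worksheet computes directly
via_table = normsdist(z)               # what the old table lookup would give
density = normdist(x, mean, sd, False) # f(x), a curve height

print(z, direct, via_table, density)
```

The two probabilities agree, which is why a single table of the Standard Normal Distribution used to be enough for any normal distribution.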


Let us consider an application of the normal distribution. In 1980, the height of adult males in the United States was approximately normally distributed with an average height of 67 inches and a standard deviation of 2.1 inches. (By 2000, the average height had risen to 69 inches.) In 1980, what was the probability that a randomly chosen adult male would be six feet tall or taller?

Graphically, we are interested in finding the probability in the white area in the graph below:

[Figure: "Height Distribution". Normal curve for height with the area at or above 72 inches left white and the rest shaded. Horizontal axis: Height (60.5 to 73.5 inches). Vertical axis: f(x).]


EXCEL can be used to find the shaded area in the above graph by using the command:

=normdist(72, 67, 2.1, true)

The probability of being six feet tall or greater (the white area) is then given by:

1 - normdist(72, 67, 2.1, true), which is approximately .0086,

or slightly less than 1% of the population.
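The same tail probability can be checked with a short Python sketch. It assumes Python's statistics.NormalDist as a substitute for the EXCEL function; the variable names are mine.

```python
from statistics import NormalDist

# Heights of adult U.S. males in 1980: mean 67 inches, sd 2.1 inches.
heights = NormalDist(mu=67, sigma=2.1)

p_at_most_72 = heights.cdf(72)     # shaded area: P(height <= 72)
p_at_least_72 = 1 - p_at_most_72   # white area:  P(height >= 72)

print(f"P(height >= 72 inches) = {p_at_least_72:.4f}")
```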


Now suppose we wished to find the proportion of adult U.S. males who were between 5 foot three (63 inches) and 5 foot ten (70 inches) tall. This is the shaded area in the graph below:

[Figure: "Normal Distribution". Normal curve for height with the area between 63 and 70 inches shaded. Horizontal axis: Height (60.5 to 73.1 inches). Vertical axis: f(x).]


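To find this probability we need to realize that:

$$P(63 \le x \le 70) = P(x \le 70) - P(x \le 63)$$

or in EXCEL terminology:

=normdist(70, 67, 2.1, true) - normdist(63, 67, 2.1, true)

This gives the answer as:

$$P(63 \le x \le 70) = .923436 - .028405 = .895031$$

Approximately 89.5% of adult U.S. males are between 5 foot three and 5 foot ten in height.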
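As a cross-check, the interval probability can be sketched in Python as the difference of two cumulative probabilities. Again statistics.NormalDist stands in for EXCEL here, and the variable names are my own.

```python
from statistics import NormalDist

heights = NormalDist(mu=67, sigma=2.1)

p_up_to_70 = heights.cdf(70)   # P(height <= 70)
p_up_to_63 = heights.cdf(63)   # P(height <= 63)

# Area between 63 and 70 inches = area up to 70 minus area up to 63.
p_between = p_up_to_70 - p_up_to_63

print(f"P(63 <= height <= 70) = {p_between:.4f}")
```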


Suppose I were interested in the inverse problem: what range of heights covers 95% of adult U.S. males?

Actually, there are many ways to answer this. I could, for example, find the height that 95% of the population exceeds. Or I could take the lower 95%. Or I could try to get the "middle" 95%.

The middle 95% would have 2.5% of the observations larger and 2.5% smaller. This is shown in the picture below:

[Figure: "Normal Distribution". Normal curve for height with the middle 95% marked and 2.5% in each tail. Horizontal axis: Height (60.5 to 73.1 inches). Vertical axis: f(x).]


EXCEL has a function which solves the following equation for x0 given any value of p for the normal distribution:

$$P(x \le x_0) = p$$

The function is:

NORMINV(p, μ, σ)

Therefore in our case, we find that:

62.884 = norminv(.025, 67, 2.1)

and,

71.116 = norminv(.975, 67, 2.1).


Therefore 95% of the heights fall between the values of 62.884 inches and 71.116 inches.

Recall that the mound rule indicated that approximately 95% of the values fell within the interval +/- two standard deviations. The values 62.884 and 71.116 correspond to +/- 1.96 standard deviations which is the more precise figure.

EXCEL also has the function:

=normsinv(p)

which is the inverse function for the standard normal distribution (mean of zero and standard deviation of 1). Directly we could have obtained:

-1.96 = normsinv(.025)

and

1.96 = normsinv(.975).

In a non-computerized statistics course, the above values would have been obtained from a table in the back of the book and then transformed to get:

67 – 1.96 * 2.1 = 62.884

and

67 + 1.96 * 2.1 = 71.116

In EXCEL we can do this directly using the norminv function.
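Both routes to the middle 95% can be sketched in Python. The method inv_cdf of statistics.NormalDist plays the role of norminv and normsinv here; that substitution and the variable names are my own.

```python
from statistics import NormalDist

heights = NormalDist(mu=67, sigma=2.1)

# Direct route, like =norminv(p, 67, 2.1).
lower = heights.inv_cdf(0.025)
upper = heights.inv_cdf(0.975)

# Table route: look up z on the standard normal, then transform back.
z_low = NormalDist().inv_cdf(0.025)    # like =normsinv(.025)
z_high = NormalDist().inv_cdf(0.975)   # like =normsinv(.975)
lower_from_z = 67 + z_low * 2.1
upper_from_z = 67 + z_high * 2.1

print(round(lower, 3), round(upper, 3))
print(round(z_low, 2), round(z_high, 2))
```

Both routes give the same two heights, since the second is just the transformation x = mean + z * sd applied to the standard normal values.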


For large values of n, we can use the normal distribution to approximate the binomial distribution by taking:

$$\mu = np \quad \text{and} \quad \sigma = \sqrt{np(1-p)}$$

This approximation will be valid if np > 5 and n(1 – p) > 5.

For example, in the case discussed previously with n = 50 and p = .3, we would have

$$\mu = 50 \times .3 = 15 \quad \text{and} \quad \sigma = \sqrt{50 \times .3 \times .7} \approx 3.24$$

and in this case np = 50(.3) = 15 and n(1 – p) = 50(.7) = 35, so the approximation should be good.
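A small Python sketch can make the comparison concrete. It assumes the usual choice of normal parameters for this approximation (mean np and standard deviation equal to the square root of np(1 - p)); the helper binom_pmf and the other names are mine.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 50, 0.3

mu = n * p                      # assumed normal mean: np
sigma = sqrt(n * p * (1 - p))   # assumed normal sd: sqrt(np(1 - p))

# Rule of thumb from the text: np > 5 and n(1 - p) > 5.
rule_ok = n * p > 5 and n * (1 - p) > 5

approx = NormalDist(mu, sigma)

def binom_pmf(k):
    """Exact binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Binomial probability next to the height of the normal curve at the same point.
for k in (12, 15, 18):
    print(k, round(binom_pmf(k), 4), round(approx.pdf(k), 4))
```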


The two curves are plotted below:

[Figure: "Normal Approximation to Binomial". Binomial probabilities and the approximating normal curve plotted together. Horizontal axis: x (0 to 48). Vertical axis: Prob (0 to 0.14). Series: binomial, normal.]

Notice however that the normal curve is continuous while the binomial distribution is discrete with nothing between the values of say 17 and 18.


This distinction between discrete and continuous is important. For the binomial distribution

$$P(x \le 10) \ne P(x < 10)$$

so that a distinction must be made between "less than or equal" and "less than". For the normal distribution however,

$$P(x \le 10) = P(x < 10)$$

since there is no probability that x will exactly equal 10 (by exactly we mean to an infinite number of decimal places).


We can get around this problem of continuous versus discrete by the use of what is called the "continuity correction". Simply put, it says that when using the normal distribution to approximate a discrete distribution (such as the Binomial or Poisson), assume that any discrete value, say 10, actually extends halfway to the previous discrete value and halfway to the subsequent discrete value. In other words, when using the normal distribution we assume that 10 actually goes from 9.5 to 10.5; 29 would go from 28.5 to 29.5, etc. If we let k be one of the discrete values, then the following five relationships illustrate the use of the continuity correction in all possible cases (discrete distribution on the left, normal distribution on the right):

$$P(x = k) \approx P(k - 0.5 \le x \le k + 0.5)$$

$$P(x \le k) \approx P(x \le k + 0.5)$$

$$P(x < k) \approx P(x \le k - 0.5)$$

$$P(x \ge k) \approx P(x \ge k - 0.5)$$

$$P(x > k) \approx P(x \ge k + 0.5)$$
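To see what the continuity correction buys, here is a hedged Python sketch using the earlier n = 50, p = .3 case and k = 10. It assumes the normal approximation uses mean np and standard deviation equal to the square root of np(1 - p); all names are my own.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 50, 0.3
approx = NormalDist(n * p, sqrt(n * p * (1 - p)))

def binom_pmf(j):
    """Exact binomial probability of exactly j successes."""
    return comb(n, j) * p**j * (1 - p) ** (n - j)

k = 10

# P(x = k): treat the discrete value k as the interval from k - 0.5 to k + 0.5.
exact_eq = binom_pmf(k)
approx_eq = approx.cdf(k + 0.5) - approx.cdf(k - 0.5)

# P(x <= k): extend the upper limit to k + 0.5.
exact_le = sum(binom_pmf(j) for j in range(k + 1))
approx_le = approx.cdf(k + 0.5)
approx_le_uncorrected = approx.cdf(k)   # for comparison, no correction

print(exact_eq, approx_eq)
print(exact_le, approx_le, approx_le_uncorrected)
```

In this case the corrected value lands noticeably closer to the exact binomial probability than the uncorrected one.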


Fortunately, in EXCEL we can use the binomdist function for most cases and need to use the normal approximation to the Binomial only infrequently.

The Poisson distribution can also be approximated by the normal distribution if the Poisson mean is greater than 5. By taking the mean of the normal distribution equal to the Poisson mean, and taking its standard deviation as the square root of that mean, we can approximate the Poisson distribution with the normal distribution.

The following graph shows the Poisson distribution with mean of 5 and the normal distribution with a mean of 5 and standard deviation equal to 2.2361 (the square root of 5):

[Figure: "Normal Approximation to Poisson". Poisson probabilities and the approximating normal curve plotted together. Horizontal axis: x (0 to 24). Vertical axis: Prob (0 to 0.2). Series: Poisson, Normal.]

Again if using the normal distribution to compute probabilities for the Poisson distribution one must correct for continuity in the same way as was done for the binomial distribution.


It is very easy to simulate data which follows the normal distribution using the norminv function in EXCEL.

The first step is to generate a set of random numbers using the rand() function as we have done before. Some sample data is shown below:

a) Use RAND() to simulate 50 random numbers

0.407998  0.327113  0.785353  0.167275  0.016527  0.524512  0.445768  0.098211  0.739599  0.815619
0.678836  0.059317  0.764079  0.228678  0.055253  0.850148  0.517072  0.37086   0.422763  0.165053
0.1953    0.250051  0.962008  0.09676   0.495551  0.021171  0.132306  0.589669  0.349044  0.176948
0.013617  0.945737  0.29648   0.938787  0.034023  0.353435  0.298913  0.840628  0.014406  0.689208
0.927638  0.541888  0.09178   0.58483   0.825083  0.348549  0.309452  0.341382  0.926097  0.406182


Now suppose we were looking at an investment with a mean return of 8% with a risk (sd) of 2%. Then we could generate 50 normally distributed returns by applying the function

=norminv(random number, .08, .02)

to the 50 random numbers previously generated to get:

b) Use NORMINV(x, .08, .02) to generate the normal random variables

0.075346  0.071042  0.095808  0.0607    0.037372  0.08123   0.077273  0.054164  0.092842  0.097976
0.089289  0.048789  0.09439   0.065136  0.048082  0.100741  0.080856  0.073408  0.076103  0.060522
0.062829  0.066513  0.11549   0.053995  0.079777  0.039397  0.057689  0.084534  0.072242  0.061459
0.035837  0.112097  0.069309  0.110893  0.043506  0.072479  0.069449  0.099941  0.036279  0.089872
0.109168  0.082104  0.053403  0.084285  0.098698  0.072215  0.070052  0.071826  0.108946  0.075252


Finally, one should check that the simulation is approximately on target.

For this data, the sample mean of the simulated values is .0748 compared to the theoretical value of .08. The standard deviation of the simulated values is .0210 compared to the theoretical value of .02. Finally, I have done a histogram of the simulated values which is shown below:

[Figure: "Frequency Distribution". Histogram of the 50 simulated returns. Horizontal axis: Simulated Returns (bins from 0.03 to 0.12, plus "More"). Vertical axis: Frequency (0 to 14).]

As can be seen it is approximately normally distributed.
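The whole procedure (uniform random numbers, then the inverse normal function, then a check of the sample mean and standard deviation) can be sketched in Python. This is only an illustration: inv_cdf from statistics.NormalDist stands in for norminv, random.Random stands in for rand(), and the seed 2024 is an arbitrary choice of mine so the run is repeatable.

```python
import random
from statistics import NormalDist, mean, stdev

# Investment with mean return 8% and risk (sd) 2%.
returns = NormalDist(mu=0.08, sigma=0.02)

# First two random numbers from table (a), pushed through the inverse function.
from_lecture = [returns.inv_cdf(u) for u in (0.407998, 0.327113)]

# A fresh set of 50 uniform random numbers (arbitrary seed, for repeatability).
rng = random.Random(2024)
uniforms = [rng.random() for _ in range(50)]
simulated = [returns.inv_cdf(u) for u in uniforms]

# Check that the simulation is approximately on target.
sample_mean = mean(simulated)
sample_sd = stdev(simulated)

print([round(r, 6) for r in from_lecture])
print(round(sample_mean, 4), round(sample_sd, 4))
```

The first two values reproduce the first two entries of table (b); the fresh sample will differ from the lecture's numbers because the random draws differ.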


What if we wanted to simulate correlated investments, as we studied earlier in this module? Specifically, suppose we wanted to simulate 25 years of returns on two investments, the first having a mean return of .08 and a standard deviation of .02 and the second having a mean return of .12 with a standard deviation of .05. Suppose also that the investments are correlated with a correlation coefficient of -.4.

The difficult part is generating the values so they are correlated. Fortunately there is a theoretical result that says if z1 and z2 are two independent random variables each with mean 0 and standard deviation 1, and we define two new variables x1 and x2 with the equations:

$$x_1 = \sqrt{\frac{1+r}{2}}\, z_1 + \sqrt{\frac{1-r}{2}}\, z_2$$

$$x_2 = \sqrt{\frac{1+r}{2}}\, z_1 - \sqrt{\frac{1-r}{2}}\, z_2$$

then x1 and x2 will both still have mean 0 and standard deviation 1, but now the x's will be correlated with correlation coefficient r.

Let us illustrate this procedure in steps. First generate two columns of 25 random numbers using the =rand() EXCEL function. The data would look like:


a) Start with random numbers in two sets of 25

0.758384    0.311735
0.301347    0.87599
0.741409    0.306248
0.591236    0.70228
0.742038    0.518124
0.764761    0.70556
0.928779    0.086294
0.754592    0.914354
0.05917     0.699389
0.324527    0.91737
0.389161    0.781662
0.159801    0.369674
0.621819    0.359246
0.091531    0.78965
0.140335    0.801505
0.444617    0.556215
0.799008    0.512305
0.266267    0.686824
0.215947    0.298083
0.031036    0.07719
0.014389    0.337121
0.526516    0.597915
0.129849    0.06653
0.459673    0.644828
0.565416    0.1606


Next use the EXCEL function =normsinv(random number) to generate two columns of uncorrelated normal random variables with mean zero and standard deviation 1. The data would look something like this:

b) Generate two independent sets of Normal with Mean of 0 and sd of 1 by using Normsinv(x)

z1          z2
0.701114    -0.49094
-0.52053    1.155174
0.647696    -0.50651
0.230726    0.53097
0.64964     0.045444
0.721701    0.54046
1.466761    -1.36393
0.689011    1.368067
-1.56178    0.522645
-0.45508    1.387595
-0.28151    0.77782
-0.99527    -0.33272
0.310261    -0.36047
-1.33138    0.805206
-1.07881    0.847008
-0.13927    0.14138
0.838082    0.030848
-0.62414    0.486869
-0.78595    -0.52992
-1.86578    -1.42423
-2.18651    -0.42033
0.066516    0.247955
-1.1271     -1.50215
-0.10126    0.371394
0.164715    -0.99199

Check correlation using CORREL: 0.017693

Using the =correl(z1, z2) we get a correlation of .017693 compared to the theoretical value of 0.


Next we implement the formula given above to induce the appropriate correlation, in this case r = -.4. The formula would look like:

c) Call columns z1 z2

d) Generate two columns x1 and x2 by using the formula

x1=sqrt((1+R)/2)*z1 + SQRT((1-R)/2)*z2

x2=sqrt((1+R)/2)*z1 - SQRT((1-R)/2)*z2
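The transformation in step d) can be sketched in Python as follows. The function names correlate and correl are my own (correl is a hand-written Pearson correlation, loosely mirroring EXCEL's CORREL), and for the large-sample check I draw standard normals with random.gauss rather than normsinv(rand()), with an arbitrary seed.

```python
import random
from math import sqrt

def correlate(z1, z2, r):
    """Turn two independent standard normals into a pair with correlation r."""
    a = sqrt((1 + r) / 2)
    b = sqrt((1 - r) / 2)
    return a * z1 + b * z2, a * z1 - b * z2

def correl(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# First pair (z1, z2) from table (b), with R = -.4.
x1, x2 = correlate(0.701114, -0.49094, -0.4)

# Large-sample check that the induced correlation is close to -.4.
rng = random.Random(7)  # arbitrary seed
pairs = [correlate(rng.gauss(0, 1), rng.gauss(0, 1), -0.4) for _ in range(20000)]
sample_r = correl([pair[0] for pair in pairs], [pair[1] for pair in pairs])

print(round(x1, 5), round(x2, 6), round(sample_r, 3))
```

With only 25 pairs, as in the lecture, the sample correlation can wander well away from -.4; the large sample is there just to show that the formula targets the right value.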


The actual formula contained in the cell for x1 is shown in the following screen shot:


The actual formula for the cell for x2 is shown in the screen shot below:


After transforming all the pairs, one would get the following result:

e) Example: make R = -.4

x1          x2
-0.02673    0.794765
0.681382    -1.25159
-0.06902    0.778537
0.570615    -0.31787
0.393844    0.317801
0.847474    -0.05689
-0.33777    1.944528
1.521994    -0.76722
-0.41814    -1.2927
0.911689    -1.4102
0.496583    -0.80496
-0.8235     -0.26676
-0.13166    0.471531
-0.05555    -1.40291
0.117767    -1.29955
0.042005    -0.19457
0.484846    0.433227
0.065486    -0.7492
-0.87385    0.01288
-2.21353    0.169663
-1.54928    -0.84593
0.243886    -0.17102
-1.87413    0.639446
0.255269    -0.36619
-0.73974    0.920179

Check (correlation of x1 and x2): -0.2958


Finally we need to adjust the generated values to have the appropriate means and standard deviations. The first investment is supposed to have a mean of .08 and a standard deviation of .02, therefore we create the new variable

.08 + .02 * x1

The second investment is supposed to have a mean of .12 and a standard deviation of .05, so we create the new variable

.12 + .05 * x2

The final results are shown below:

f) Now multiply x1 by .02 and add .08 and multiply x2 by .05 and add .12

Investment 1    Investment 2
0.079465        0.159738
0.093628        0.05742
0.07862         0.158927
0.091412        0.104107
0.087877        0.13589
0.096949        0.117156
0.073245        0.217226
0.11044         0.081639
0.071637        0.055365
0.098234        0.04949
0.089932        0.079752
0.06353         0.106662
0.077367        0.143577
0.078889        0.049854
0.082355        0.055023
0.08084         0.110272
0.089697        0.141661
0.08131         0.08254
0.062523        0.120644
0.035729        0.128483
0.049014        0.077704
0.084878        0.111449
0.042517        0.151972
0.085105        0.10169
0.065205        0.166009

Check results
Average         0.078016        0.11057
Sd              0.017468        0.042937
corr = -0.2958


As can be seen the simulation results agree reasonably with the theoretical values.

I could now simulate what would happen for a 25 year period into the future if I invested $10,000 in each investment. I just need to add one to the simulated returns and cumulate the investment history as shown below:

                      Investment 1               Investment 2
                      Value       1 + return     Value       1 + return
Initial Investment    $10,000                    $10,000
Year 1                $10,795     1.079465       $11,598     1.159783
Year 2                $11,805     1.093628       $12,264     1.05742
Year 3                $12,733     1.07862        $14,213     1.158927
Year 4                $13,897     1.091412       $15,692     1.104107
Year 5                $15,119     1.087877       $17,825     1.13589
Year 6                $16,584     1.096949       $19,913     1.117156
Year 7                $17,799     1.073245       $24,239     1.217226
Year 8                $19,765     1.11044        $26,218     1.081639
Year 9                $21,181     1.071637       $27,669     1.055365
Year 10               $23,262     1.098234       $29,039     1.04949
Year 11               $25,353     1.089932       $31,355     1.079752
Year 12               $26,964     1.06353        $34,699     1.106662
Year 13               $29,050     1.077367       $39,681     1.143577
Year 14               $31,342     1.078889       $41,659     1.049854
Year 15               $33,923     1.082355       $43,951     1.055023
Year 16               $36,666     1.08084        $48,798     1.110272
Year 17               $39,954     1.089697       $55,711     1.141661
Year 18               $43,203     1.08131        $60,309     1.08254
Year 19               $45,904     1.062523       $67,585     1.120644
Year 20               $47,544     1.035729       $76,268     1.128483
Year 21               $49,875     1.049014       $82,195     1.077704
Year 22               $54,108     1.084878       $91,355     1.111449
Year 23               $56,409     1.042517       $105,239    1.151972
Year 24               $61,209     1.085105       $115,940    1.10169
Year 25               $65,200     1.065205       $135,188    1.166009

Total Portfolio Value After 25 Years = $200,388
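The cumulation step is just repeated multiplication by (1 + return). A minimal Python sketch, using only the first three simulated returns for Investment 1 from the table above (the function name cumulate is mine):

```python
def cumulate(initial, yearly_returns):
    """Value of the investment at the end of each year."""
    values = []
    value = initial
    for r in yearly_returns:
        value *= 1 + r          # add one to the return and compound
        values.append(value)
    return values

# First three simulated returns for Investment 1.
first_three = [0.079465, 0.093628, 0.07862]
path = cumulate(10_000, first_three)

print([round(v) for v in path])
```

Running all 25 returns for each investment and adding the two final values gives the total portfolio value.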


By simply pressing the F9 key, I would recompute all of the values in the above simulation to get results such as:

f) Now multiply x1 by .02 and add .08 and multiply x2 by .05 and add .12

Investment 1    Investment 2
0.075041        0.011628
0.061764        0.208917
0.083372        0.044281
0.094959        0.121946
0.081673        0.113509
0.063944        0.105207
0.06562         0.172036
0.094026        0.141238
0.062977        0.141087
0.055172        0.16497
0.058103        0.088822
0.091082        0.110032
0.06549         0.086719
0.08358         0.11165
0.067008        0.0865
0.099086        0.069589
0.07221         0.128037
0.060327        0.112246
0.072065        0.059908
0.090064        0.095344
0.074848        0.161608
0.086461        0.111251
0.111891        0.091642
0.112906        0.08298
0.072155        0.104165

Check results
Average         0.078233        0.109012
Sd              0.016182        0.042203
corr = -0.30182


And another realization of a 25 year investment as:

                      Investment 1               Investment 2
                      Value       1 + return     Value       1 + return
Initial Investment    $10,000                    $10,000
Year 1                $10,750     1.075041       $10,116     1.011628
Year 2                $11,414     1.061764       $12,230     1.208917
Year 3                $12,366     1.083372       $12,771     1.044281
Year 4                $13,540     1.094959       $14,329     1.121946
Year 5                $14,646     1.081673       $15,955     1.113509
Year 6                $15,583     1.063944       $17,634     1.105207
Year 7                $16,605     1.06562        $20,667     1.172036
Year 8                $18,167     1.094026       $23,586     1.141238
Year 9                $19,311     1.062977       $26,914     1.141087
Year 10               $20,376     1.055172       $31,354     1.16497
Year 11               $21,560     1.058103       $34,139     1.088822
Year 12               $23,524     1.091082       $37,895     1.110032
Year 13               $25,064     1.06549        $41,182     1.086719
Year 14               $27,159     1.08358        $45,780     1.11165
Year 15               $28,979     1.067008       $49,740     1.0865
Year 16               $31,850     1.099086       $53,201     1.069589
Year 17               $34,150     1.07221        $60,013     1.128037
Year 18               $36,211     1.060327       $66,749     1.112246
Year 19               $38,820     1.072065       $70,748     1.059908
Year 20               $42,316     1.090064       $77,493     1.095344
Year 21               $45,484     1.074848       $90,016     1.161608
Year 22               $49,416     1.086461       $100,031    1.111251
Year 23               $54,945     1.111891       $109,198    1.091642
Year 24               $61,149     1.112906       $118,259    1.08298
Year 25               $65,561     1.072155       $130,578    1.104165

Total Portfolio Value After 25 Years = $196,139


By repeatedly recording the Total Portfolio Value after 25 years for each simulation, I could obtain the mean and variation of the results. This is illustrated in the table below for ten simulations of the twenty five year period:

Simulation    Portfolio Value
Number        at end of 25 Years
1             $200,388
2             $196,139
3             $189,450
4             $208,590
5             $279,617
6             $225,883
7             $203,936
8             $267,466
9             $203,507
10            $191,895

Mean          $216,687
SD            $31,745

Notice that the SD (risk) of the value of this portfolio after 25 years is very high. By Chebyshev's rule, the 25-year value would fall between approximately $153,200 and $280,200 (the mean plus or minus two standard deviations) at least 75% of the time.

Note: 10 is a very small number of simulations; 100 would be much better.
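The summary figures can be reproduced with a few lines of Python from the ten simulated portfolio values. The sketch assumes the SD reported is the sample standard deviation (dividing by n - 1), which matches the value in the table; the variable names are mine.

```python
from statistics import mean, stdev

# Total portfolio value after 25 years for the ten simulations.
values = [200_388, 196_139, 189_450, 208_590, 279_617,
          225_883, 203_936, 267_466, 203_507, 191_895]

avg = mean(values)
sd = stdev(values)    # sample standard deviation (n - 1 in the denominator)

# Chebyshev: at least 1 - 1/k**2 of the values lie within k sd of the mean.
k = 2
coverage = 1 - 1 / k**2
low, high = avg - k * sd, avg + k * sd

print(round(avg), round(sd))
print(round(low), round(high), coverage)
```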


Simulating a Decision Problem

In Lecture 3 of this module we discussed how to use decision trees to help make decisions in an uncertain environment. It is possible to simulate the problem outlined by the decision tree as an alternative to the formal procedures outlined earlier. Below is the decision tree for that problem.


Your Decision             Consultants Say    Your Decision    Actual State       Profit in Millions
Hire Consultants          Low (0.2200)       Invest 2         Low (0.8182)       2
                                                              Medium (0.1136)    5
                                                              High (0.0682)      11
                                             Invest 1         A (0.5000)         3
                                                              B (0.5000)         6
                          Med (0.4400)       Invest 2         Low (0.0227)       2
                                                              Med (0.9091)       5
                                                              High (0.0682)      11
                                             Invest 1         A (0.5000)         3
                                                              B (0.5000)         6
                          High (0.3400)      Invest 2         Low (0.0294)       2
                                                              Med (0.2206)       5
                                                              High (0.7500)      11
                                             Invest 1         A (0.5000)         3
                                                              B (0.5000)         6
Don't Hire Consultants                       Invest 2         Low (0.2000)       3
                                                              Med (0.5000)       6
                                                              High (0.3000)      12
                                             Invest 1         A (0.5000)         4
                                                              B (0.5000)         7

The first step in simulating the above tree is to devise a way to simulate the uncertain outcomes, that is, the outcomes over which we have no control. In the above tree there are three such outcomes. These are identified by branches labeled with probabilities. Looking at the tree from left to right, we first encounter the uncertainty in what the consultants will say. That is:


Consultants Say    Probability    Cumulative Probability
                                  0.00
Low                0.22           0.22
Medium             0.44           0.66
High               0.34           1.00
                   ----
                   1

To simulate this distribution we need only simulate a random number using the =rand() function in EXCEL. If the random number is between 0.00 and 0.22, then the outcome corresponds to the Consultants saying “Low”. If it is between 0.22 and 0.66, then the outcome corresponds to the Consultants saying “Medium” and if the outcome is greater than .66, the outcome corresponds to the Consultants saying “High”.
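The lookup against the cumulative probabilities can be sketched in Python. The function pick and the seed are my own choices for illustration; random.Random stands in for EXCEL's rand().

```python
import random

# What the consultants say, with the probabilities from the table.
CONSULTANTS = (("Low", 0.22), ("Medium", 0.44), ("High", 0.34))

def pick(u, table):
    """Return the outcome whose cumulative-probability band contains u."""
    cumulative = 0.0
    for outcome, prob in table:
        cumulative += prob
        if u <= cumulative:
            return outcome
    return table[-1][0]   # guard against rounding in the last band

rng = random.Random(1)    # arbitrary seed, for repeatability
draws = [pick(rng.random(), CONSULTANTS) for _ in range(20000)]
shares = {name: draws.count(name) / len(draws) for name, _ in CONSULTANTS}

print(pick(0.10, CONSULTANTS), pick(0.50, CONSULTANTS), pick(0.90, CONSULTANTS))
print(shares)
```

Over many draws the shares settle near .22, .44 and .34, which is the check that the lookup is doing what the table says.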


Again moving from left to right, we encounter the two possible outcomes, A and B. For Investment 1, the probability table would look like:

Outcome    Probability    Cumulative Probability
                          0.00
A          0.5            0.50
B          0.5            1.00
           ----
           1

In order to simulate this outcome, we need to generate one random number and compare it to 0.50. If it is less than or equal to 0.50, then we indicate that Outcome A has occurred, otherwise outcome B occurs.


Simulation of Investment 2 is more complicated since the probability distribution depends on what the consultants say. For example if the Consultants say “Low”, then we simulate the Actual State from the following table:

Consultants say Low

Actual State Outcome    Probability    Cumulative Probability
                                       0.0000
Low                     0.8182         0.8182
Medium                  0.1136         0.9318
High                    0.0682         1.0000
                        ------
                        1

If the generated random number is between 0.0000 and 0.8182, then we record that the Actual State was Low; if the random number is between 0.8182 and 0.9318, we record that the Actual State was Medium; and if the random number is greater than 0.9318 we record that the Actual State was High.


If the Consultants say Medium, then the appropriate distribution to simulate is:

Consultants say Medium

Actual State Outcome    Probability    Cumulative Probability
                                       0.0000
Low                     0.0227         0.0227
Medium                  0.9091         0.9318
High                    0.0682         1.0000
                        ------
                        1

If the generated random number is between 0.0000 and 0.0227, then we record that the Actual State was Low; if the random number is between 0.0227 and 0.9318, we record that the Actual State was Medium; and if the random number is greater than 0.9318 we record that the Actual State was High.


Finally, if the consultants say High, then the appropriate distribution for the simulation is given by:

Consultants say High

Actual State Outcome    Probability    Cumulative Probability
                                       0.0000
Low                     0.0294         0.0294
Medium                  0.2206         0.2500
High                    0.7500         1.0000
                        ------
                        1

If the generated random number is between 0.0000 and 0.0294, then we record that the Actual State was Low; if the random number is between 0.0294 and 0.2500, we record that the Actual State was Medium; and if the random number is greater than 0.2500 we record that the Actual State was High.
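Because the table used for the Actual State depends on what the consultants say, the simulation has two stages. Here is a hedged Python sketch of both; the names pick and actual_state, the dictionary layout, and the seed are all my own.

```python
import random

CONSULTANTS = (("Low", 0.22), ("Medium", 0.44), ("High", 0.34))

# Actual State probabilities, one table per consultant report.
ACTUAL_GIVEN_SAYS = {
    "Low":    (("Low", 0.8182), ("Medium", 0.1136), ("High", 0.0682)),
    "Medium": (("Low", 0.0227), ("Medium", 0.9091), ("High", 0.0682)),
    "High":   (("Low", 0.0294), ("Medium", 0.2206), ("High", 0.7500)),
}

def pick(u, table):
    """Return the outcome whose cumulative-probability band contains u."""
    cumulative = 0.0
    for outcome, prob in table:
        cumulative += prob
        if u <= cumulative:
            return outcome
    return table[-1][0]   # guard against rounding in the last band

def actual_state(says, u):
    """Second stage: use the table that matches the consultants' report."""
    return pick(u, ACTUAL_GIVEN_SAYS[says])

rng = random.Random(3)    # arbitrary seed
states = []
for _ in range(40000):
    says = pick(rng.random(), CONSULTANTS)          # stage 1
    states.append(actual_state(says, rng.random())) # stage 2
share_low = states.count("Low") / len(states)

print(actual_state("Low", 0.90), actual_state("High", 0.10), round(share_low, 3))
```

As a sanity check, the overall share of Low actual states should come out near .20, the probability on the "Don't Hire Consultants" branch of the tree.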


In the EXCEL file “simulatedecision.xls”, the above procedures are implemented. The results for ten simulations are shown below:

Random Components

Repeat    Investment 1    Consultant Says    Actual State
1         B               Med                Med
2         A               Med                Med
3         A               High               High
4         A               High               High
5         A               Med                High
6         A               Med                Med
7         B               Med                Med
8         B               High               Med
9         B               Med                Med
10        B               Med                Med

We have now finished with all of the random components of the tree, that is, we can now simulate the outcomes over which the manager has no control. Now it is necessary to list the possible decisions that the manager can make.


Again looking at the decision tree from left to right, the first major decision is whether or not to hire the Consultants. Following the “Don’t Hire Consultants” branch, the manager must next decide whether to pick Investment 1 or Investment 2.

On the other hand, if one decides to hire the consultants, the consultants will give their report and then the manager must decide between Investment 1 and Investment 2. Since we do not know what the consultants will say, a strategy must specify what investment the manager will pick for every possible prediction the consultants could make. That is, we must specify an investment choice for the three possible predictions of Low, Medium, and High. Since there are two possible choices for each of the consultants' three possible predictions, there are 2 × 2 × 2 = 8 possible strategies that must be specified. These are shown below:

Manager Decision Strategies

            Investment Pick If Consultants Say
Strategy    Low    Medium    High
I           1      1         1
II          2      2         2
III         1      2         2
IV          2      1         2
V           2      2         1
VI          1      1         2
VII         1      2         1
VIII        2      1         1

In total there are 10 possible strategies that the manager can follow: two if one doesn’t hire the consultants and eight if one does.
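The count of eight can be confirmed by listing every combination of picks. A short Python sketch (the names are mine, and the order produced by itertools.product is not the same as the I to VIII numbering in the table):

```python
from itertools import product

reports = ("Low", "Medium", "High")

# One investment pick (1 or 2) for each possible consultant report.
strategies = [dict(zip(reports, picks)) for picks in product((1, 2), repeat=3)]

# Two more strategies if the consultants are not hired: Invest 1 or Invest 2.
total = 2 + len(strategies)

for s in strategies:
    print(s)
print(total)
```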


One now adds ten columns, one for each strategy, to the random columns that we specified before. The results would look like:

Simulation of the Investment Decision

Columns: Inv. 1 = Investment 1 outcome; Says = what the consultants say; State = actual state; NC 1 and NC 2 = No Consult, Invest 1 and Invest 2; I to VIII = Hire Consult strategies. The three rows under the header give each strategy's investment pick when the consultants say Low, Med, or High.

        Random Components        Possible Decisions
Repeat  Inv. 1  Says   State     NC 1   NC 2      I    II   III  IV   V    VI   VII  VIII
Low =                                             1    2    1    2    2    1    1    2
Med =                                             1    2    2    1    2    1    2    1
High =                                            1    2    2    2    1    2    1    1
1       B       Med    Med       7      6         6    5    5    6    5    6    5    6
2       A       Med    Med       4      6         3    5    5    3    5    3    5    3
3       A       High   High      4      12        3    11   11   11   3    11   3    3
4       A       High   High      4      12        3    11   11   11   3    11   3    3
5       A       Med    High      4      12        3    11   11   3    11   3    11   3
6       A       Med    Med       4      6         3    5    5    3    5    3    5    3
7       B       Med    Med       7      6         6    5    5    6    5    6    5    6
8       B       High   Med       7      6         6    5    5    5    6    5    6    6
9       B       Med    Med       7      6         6    5    5    6    5    6    5    6
10      B       Med    Med       7      6         6    5    5    6    5    6    5    6

Examine “Repeat 1”. For the No Consultant Investment 1 strategy, the simulation indicated that outcome B occurred for Investment 1, so our gain would have been 7 (million dollars). For the No Consultant Investment 2 strategy, the actual state was Medium, so we would gain 6 (million dollars). The other eight strategies all involve hiring the consultants at a cost of 1 (million dollars). The simulation indicates that the Consultants said Medium, so for Hire Consultant Strategies I, IV, VI and VIII we would gain 6 (million dollars), since the Investment 1 result was B, which paid 7 (million dollars), but we paid one million for the consultants. For the Hire Consultant Strategies II, III, V, and VII we would gain 5 (million dollars), since we chose Investment 2, which pays 6 (million dollars) for the Medium actual state, but the cost of the consultants was one million dollars. Try working out the results for “Repeat 5”.
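The bookkeeping for one repeat can be sketched in Python. The payoffs, the fee of 1 (million dollars), and the strategy table come from the lecture; the function name payoffs and the data layout are my own. I am not claiming this is how the EXCEL file “simulatedecision.xls” is built.

```python
# Payoffs in millions when the consultants are NOT hired.
INVEST1 = {"A": 4, "B": 7}
INVEST2 = {"Low": 3, "Med": 6, "High": 12}
CONSULT_FEE = 1

REPORTS = ("Low", "Med", "High")
# Investment pick for each report (Low, Med, High), strategies I to VIII.
STRATEGIES = {
    "I": (1, 1, 1), "II": (2, 2, 2), "III": (1, 2, 2), "IV": (2, 1, 2),
    "V": (2, 2, 1), "VI": (1, 1, 2), "VII": (1, 2, 1), "VIII": (2, 1, 1),
}

def payoffs(outcome1, says, state):
    """Gains for all ten strategies given one set of random components."""
    row = [INVEST1[outcome1], INVEST2[state]]     # the two No Consult strategies
    for picks in STRATEGIES.values():
        choice = picks[REPORTS.index(says)]       # pick that follows the report
        gross = INVEST1[outcome1] if choice == 1 else INVEST2[state]
        row.append(gross - CONSULT_FEE)           # consultants cost 1 million
    return row

# Repeat 1: Investment 1 gave B, consultants said Med, actual state was Med.
repeat_1 = payoffs("B", "Med", "Med")
print(repeat_1)
```

The same function can be used to check a hand calculation for any other repeat by passing its three random components.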


Below is an example of fifty repetitions of the simulation along with the average, standard deviation and ratio of mean to standard deviation for all ten strategies.

Simulation of the Investment Decision

Random Components Possible Decisions

Hire Consult strategies: investment (1 or 2) chosen for each consultant forecast

Forecast  I     II    III   IV    V     VI    VII   VIII
Low       1     2     1     2     2     1     1     2
Med       1     2     2     1     2     1     2     1
High      1     2     2     2     1     2     1     1

        Investment  Consultant  Actual  No Consult  No Consult  Hire Consult
Repeat  1           Says        State   Invest 1    Invest 2    I     II    III   IV    V     VI    VII   VIII
1       B           Low         Med     7           6           6     5     6     5     5     6     6     5
2       B           High        Med     7           6           6     5     5     5     6     5     6     6
3       B           Med         Med     7           6           6     5     5     6     5     6     5     6
4       B           Med         Med     7           6           6     5     5     6     5     6     5     6
5       A           Med         Med     4           6           3     5     5     3     5     3     5     3
6       A           Med         High    4           12          3     11    11    3     11    3     11    3
7       A           Med         Med     4           6           3     5     5     3     5     3     5     3
8       A           Med         Med     4           6           3     5     5     3     5     3     5     3
9       A           High        Med     4           6           3     5     5     5     3     5     3     3
10      A           High        High    4           12          3     11    11    11    3     11    3     3
11      A           Med         Med     4           6           3     5     5     3     5     3     5     3
12      A           High        High    4           12          3     11    11    11    3     11    3     3
13      B           Low         Low     7           3           6     2     6     2     2     6     6     2
14      A           High        High    4           12          3     11    11    11    3     11    3     3
15      A           Med         Med     4           6           3     5     5     3     5     3     5     3
16      B           High        High    7           12          6     11    11    11    6     11    6     6
17      B           High        Med     7           6           6     5     5     5     6     5     6     6
18      B           Low         Low     7           3           6     2     6     2     2     6     6     2
19      A           High        Med     4           6           3     5     5     5     3     5     3     3
20      B           Low         Low     7           3           6     2     6     2     2     6     6     2
21      A           High        High    4           12          3     11    11    11    3     11    3     3
22      A           Med         Med     4           6           3     5     5     3     5     3     5     3
23      A           Low         Med     4           6           3     5     3     5     5     3     3     5
24      B           Med         Med     7           6           6     5     5     6     5     6     5     6
25      A           High        High    4           12          3     11    11    11    3     11    3     3
26      A           High        High    4           12          3     11    11    11    3     11    3     3
27      B           Low         Low     7           3           6     2     6     2     2     6     6     2
28      A           Low         Low     4           3           3     2     3     2     2     3     3     2
29      B           High        High    7           12          6     11    11    11    6     11    6     6
30      B           Med         Low     7           3           6     2     2     6     2     6     2     6
31      B           Med         Med     7           6           6     5     5     6     5     6     5     6
32      B           Med         Med     7           6           6     5     5     6     5     6     5     6
33      B           Med         Med     7           6           6     5     5     6     5     6     5     6
34      B           High        High    7           12          6     11    11    11    6     11    6     6
35      B           Low         Low     7           3           6     2     6     2     2     6     6     2
36      A           Med         Med     4           6           3     5     5     3     5     3     5     3
37      A           Med         Med     4           6           3     5     5     3     5     3     5     3
38      B           High        High    7           12          6     11    11    11    6     11    6     6
39      B           Med         Med     7           6           6     5     5     6     5     6     5     6
40      A           High        High    4           12          3     11    11    11    3     11    3     3
41      A           Low         Low     4           3           3     2     3     2     2     3     3     2
42      B           High        Med     7           6           6     5     5     5     6     5     6     6
43      B           Med         Med     7           6           6     5     5     6     5     6     5     6
44      B           High        High    7           12          6     11    11    11    6     11    6     6
45      B           Med         Med     7           6           6     5     5     6     5     6     5     6
46      A           Med         Med     4           6           3     5     5     3     5     3     5     3
47      B           High        High    7           12          6     11    11    11    6     11    6     6
48      A           Low         Low     4           3           3     2     3     2     2     3     3     2
49      A           Med         Med     4           6           3     5     5     3     5     3     5     3
50      A           High        High    4           12          3     11    11    11    3     11    3     3

Average                                 5.5         7.26        4.5   6.26  6.7   5.96  4.36  6.4   4.8   4.06
S.D.                                    1.52        3.32        1.52  3.32  2.95  3.45  1.72  3.14  1.53  1.63
Ratio                                   3.63        2.18        2.97  1.88  2.27  1.73  2.53  2.04  3.15  2.48

Notice that the highest average, 7.26 million, is given by the strategy Don’t Hire Consultants and go with Investment 2. This agrees with our earlier theoretical results. Similarly, the highest ratio, 3.63, is achieved by the strategy Don’t Hire Consultants and go with Investment 1, which also agrees with our earlier theoretical results.
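As a check on the summary rows, the following Python sketch recomputes the average, standard deviation and ratio for the two No Consult strategies. The two strings are transcribed from the Investment 1 and Actual State columns of the fifty repeats above (L, M, H stand for Low, Med, High). The use of the sample standard deviation (n - 1 denominator) is an inference: it is the form that reproduces the tabulated 1.52 and 3.32. The name `summarize` is hypothetical.

```python
import statistics

# One character per repeat, transcribed from the fifty-repeat table.
OUTCOMES = "BBBBAAAAAA" "AABAABBBAB" "AAABAABABB" "BBBBBAABBA" "ABBBBABAAA"
STATES = "MMMMMHMMMH" "MHLHMHMLML" "HMMMHHLLHL" "MMMHLMMHMH" "LMMHMMHLMH"

# Gross payoffs in millions of dollars, as read off the table.
INVEST1_PAYOFF = {"A": 4, "B": 7}
INVEST2_PAYOFF = {"L": 3, "M": 6, "H": 12}


def summarize(gains):
    """Return (average, sample S.D., average / S.D.), each rounded to 2 places."""
    mean = statistics.mean(gains)
    sd = statistics.stdev(gains)  # sample standard deviation, n - 1 denominator
    return round(mean, 2), round(sd, 2), round(mean / sd, 2)


invest1_gains = [INVEST1_PAYOFF[o] for o in OUTCOMES]
invest2_gains = [INVEST2_PAYOFF[s] for s in STATES]

print(summarize(invest1_gains))  # (5.5, 1.52, 3.63)
print(summarize(invest2_gains))  # (7.26, 3.32, 2.18)
```

Investment 2 has the larger average but also a much larger spread, which is why its ratio is lower than that of Investment 1.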


A simulation of fifty cases is quite small, and sometimes the results deviate from theory. For example, in the fifty cases simulated below, the highest ratio is computed for Strategy VII (3.88) rather than for the strategy Don’t Hire Consultants and go with Investment 1 (3.67). However, most of the time, Strategy VII does not have the highest ratio. It is important, when simulating data, to make sure that you simulate the situation several hundred times so that random deviations from theory do not mislead decision making.

Simulation of the Investment Decision

Random Components Possible Decisions

Hire Consult strategies: investment (1 or 2) chosen for each consultant forecast

Forecast  I     II    III   IV    V     VI    VII   VIII
Low       1     2     1     2     2     1     1     2
Med       1     2     2     1     2     1     2     1
High      1     2     2     2     1     2     1     1

        Investment  Consultant  Actual  No Consult  No Consult  Hire Consult
Repeat  1           Says        State   Invest 1    Invest 2    I     II    III   IV    V     VI    VII   VIII
1       A           Low         Low     4           3           3     2     3     2     2     3     3     2
2       A           Low         Med     4           6           3     5     3     5     5     3     3     5
3       B           Med         Med     7           6           6     5     5     6     5     6     5     6
4       B           Med         Med     7           6           6     5     5     6     5     6     5     6
5       B           High        High    7           12          6     11    11    11    6     11    6     6
6       B           Low         Low     7           3           6     2     6     2     2     6     6     2
7       B           Med         Med     7           6           6     5     5     6     5     6     5     6
8       A           Med         Med     4           6           3     5     5     3     5     3     5     3
9       A           High        Med     4           6           3     5     5     5     3     5     3     3
10      A           High        High    4           12          3     11    11    11    3     11    3     3
11      A           Low         Low     4           3           3     2     3     2     2     3     3     2
12      B           Med         Med     7           6           6     5     5     6     5     6     5     6
13      A           High        Med     4           6           3     5     5     5     3     5     3     3
14      A           Med         Med     4           6           3     5     5     3     5     3     5     3
15      B           Med         Med     7           6           6     5     5     6     5     6     5     6
16      A           High        High    4           12          3     11    11    11    3     11    3     3
17      A           Med         Med     4           6           3     5     5     3     5     3     5     3
18      A           Med         Med     4           6           3     5     5     3     5     3     5     3
19      A           High        High    4           12          3     11    11    11    3     11    3     3
20      A           Med         Med     4           6           3     5     5     3     5     3     5     3
21      A           High        Low     4           3           3     2     2     2     3     2     3     3
22      A           High        High    4           12          3     11    11    11    3     11    3     3
23      B           High        High    7           12          6     11    11    11    6     11    6     6
24      B           Med         Med     7           6           6     5     5     6     5     6     5     6
25      B           Med         Med     7           6           6     5     5     6     5     6     5     6
26      A           Med         Med     4           6           3     5     5     3     5     3     5     3
27      B           Med         Med     7           6           6     5     5     6     5     6     5     6
28      B           Low         High    7           12          6     11    6     11    11    6     6     11
29      B           Low         Low     7           3           6     2     6     2     2     6     6     2
30      B           Low         Low     7           3           6     2     6     2     2     6     6     2
31      B           High        High    7           12          6     11    11    11    6     11    6     6
32      B           Med         Med     7           6           6     5     5     6     5     6     5     6
33      A           Med         Med     4           6           3     5     5     3     5     3     5     3
34      B           Low         Low     7           3           6     2     6     2     2     6     6     2
35      B           Med         Med     7           6           6     5     5     6     5     6     5     6
36      B           Med         Med     7           6           6     5     5     6     5     6     5     6
37      A           Low         Low     4           3           3     2     3     2     2     3     3     2
38      A           Low         Low     4           3           3     2     3     2     2     3     3     2
39      B           Med         Med     7           6           6     5     5     6     5     6     5     6
40      B           Med         Med     7           6           6     5     5     6     5     6     5     6
41      B           High        High    7           12          6     11    11    11    6     11    6     6
42      A           Low         Low     4           3           3     2     3     2     2     3     3     2
43      A           High        High    4           12          3     11    11    11    3     11    3     3
44      B           Med         Med     7           6           6     5     5     6     5     6     5     6
45      A           Low         Low     4           3           3     2     3     2     2     3     3     2
46      A           Low         High    4           12          3     11    3     11    11    3     3     11
47      B           High        High    7           12          6     11    11    11    6     11    6     6
48      B           Med         Med     7           6           6     5     5     6     5     6     5     6
49      B           Med         Med     7           6           6     5     5     6     5     6     5     6
50      A           High        High    4           12          3     11    11    11    3     11    3     3

Average                                 5.56        6.9         4.56  5.9   6.04  5.94  4.38  6.08  4.52  4.42
S.D.                                    1.51        3.28        1.51  3.28  2.81  3.40  1.94  2.95  1.16  2.17
Ratio                                   3.67        2.10        3.01  1.80  2.15  1.75  2.26  2.06  3.88  2.04
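To see why several hundred repetitions are safer than fifty, the following Python sketch repeats a whole simulation many times and measures how much the mean-to-S.D. ratio moves from run to run. It is an illustration under stated assumptions, not the simulation used for the tables: only the No Consult, Investment 1 strategy is simulated, and `P_OUTCOME_B = 0.5` is a hypothetical placeholder probability rather than a value taken from the lecture. The function names are likewise hypothetical.

```python
import random
import statistics

P_OUTCOME_B = 0.5            # hypothetical placeholder, NOT a value from the lecture
PAYOFF = {"A": 4, "B": 7}    # Investment 1 payoffs in millions, as in the tables


def ratio_for_one_run(n_repeats, rng):
    """Simulate n_repeats gains for No Consult, Investment 1; return mean / S.D."""
    gains = [
        PAYOFF["B"] if rng.random() < P_OUTCOME_B else PAYOFF["A"]
        for _ in range(n_repeats)
    ]
    sd = statistics.stdev(gains)
    if sd == 0:
        raise ValueError("all simulated gains are identical; ratio is undefined")
    return statistics.mean(gains) / sd


def spread_of_ratio(n_repeats, n_runs=200, seed=0):
    """Standard deviation of the ratio across n_runs independent simulations."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    ratios = [ratio_for_one_run(n_repeats, rng) for _ in range(n_runs)]
    return statistics.stdev(ratios)


spread_50 = spread_of_ratio(50)
spread_500 = spread_of_ratio(500)
print(f"run-to-run spread of the ratio, 50 repeats:  {spread_50:.3f}")
print(f"run-to-run spread of the ratio, 500 repeats: {spread_500:.3f}")
```

With fifty repeats the ratio wanders noticeably between runs, which is enough for two strategies with similar theoretical ratios to swap places by chance. With five hundred repeats the spread is much smaller.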