SESSION 39 & 40 | Last Update: 11th May 2011 | Continuous Probability Distributions


Lecturer: Florian Boehlandt
University: University of Stellenbosch Business School
Domain: http://www.hedge-fund-analysis.net/pages/vega.php

http://www.hedge-fund-analysis.net/


Learning Objectives

1. Population and Samples
2. Point Estimates vs. Confidence Interval Estimates
3. Calculating Confidence Intervals

Normal Probabilities

Often it may be prohibitively expensive to obtain information on all members of a population. Thus, market researchers usually collect information from a sample, or subset, of the population. The sample statistics (e.g. the sample mean) are calculated and used to estimate the population parameters (e.g. the population mean). This process is known as statistical inference.

Notation

The notation for sample statistics and population parameters is given in the table below:

                        Size   Mean    Standard Deviation   Proportion
Population Parameters   N      μ       σ                    P
Sample Statistics       n      x-bar   s                    p-hat

Inference

[Diagram: the sample statistic is used to estimate the unknown population parameter, either as a point estimate (equal to the sample statistic) or as a confidence interval estimate.]

A point estimator draws inferences about the population by estimating the value of an unknown parameter using a single value or point.

An interval estimator draws inferences about the population by estimating the value of an unknown parameter using an interval.

Common confidence levels include:
- 90%: weak statistical evidence
- 95%: strong statistical evidence
- 99%: overwhelming statistical evidence

Central Limit Theorem

The sampling distribution of the mean of a random sample drawn from any population is approximately normal for sufficiently large sample sizes. The larger the sample size, the more closely the sampling distribution of x-bar will resemble the normal distribution. This is an important notion since it allows for using the normal distribution to describe the dispersion of sample means. Example: tossing n dice and recording the average result.
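The dice example can be tried out numerically. The following is a minimal sketch, not part of the original slides: the function name `sample_means`, the number of trials and the seed are illustrative assumptions. It averages n fair six-sided dice many times and prints the mean and standard deviation of those averages for several n.

```python
import random
import statistics

def sample_means(n_dice, n_trials=10_000, seed=0):
    """Return n_trials averages, each taken over n_dice fair six-sided dice.

    The name, trial count and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.randint(1, 6) for _ in range(n_dice))
        for _ in range(n_trials)
    ]

# A single die has mean 3.5 and standard deviation sqrt(35/12), about 1.708.
for n in (1, 2, 10, 30):
    means = sample_means(n)
    print(n, round(statistics.fmean(means), 3), round(statistics.pstdev(means), 3))
```

As n grows, the averages stay centred on 3.5 while their spread shrinks roughly in proportion to 1/√n; a histogram of `means` would look increasingly bell-shaped.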

Sampling Distribution

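It can be shown that the sampling distribution of X-bar is described as follows:

$$\mu_{\bar{X}} = \mu, \qquad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

If X is normal, X-bar is normal. If X is nonnormal, X-bar is approximately normal for sufficiently large sample sizes. So for the sampling distribution, the standardized variable

$$Z = \frac{X - \mu}{\sigma}$$

changes to:

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$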

Example

Suppose that the amount of time to assemble a computer is normally distributed with a mean μ = 50 minutes and a standard deviation σ = 10 minutes.
a) What is the probability that one randomly selected computer is assembled in a time less than 60 minutes?
b) What is the probability that four randomly selected computers have a mean assembly time of less than 60 minutes?

Solution

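a) $Z = \frac{X - \mu}{\sigma / \sqrt{n}} = \frac{60 - 50}{10 / \sqrt{1}} = 1$

b) $Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{60 - 50}{10 / \sqrt{4}} = 2$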

The associated probabilities are P(Z < 1) = 0.8413 and P(Z < 2) = 0.9772 respectively.
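These two probabilities can be checked without a printed table. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `prob_mean_below` is an illustrative assumption, not something from the slides.

```python
from math import sqrt
from statistics import NormalDist

MU = 50     # population mean assembly time in minutes
SIGMA = 10  # population standard deviation in minutes

def prob_mean_below(x, n):
    """P(mean of n assembly times < x), using Z = (x - mu) / (sigma / sqrt(n))."""
    z = (x - MU) / (SIGMA / sqrt(n))
    return NormalDist().cdf(z)

print(round(prob_mean_below(60, 1), 4))  # 0.8413, part a)
print(round(prob_mean_below(60, 4), 4))  # 0.9772, part b)
```

Averaging over four computers halves the standard error (10 / √4 = 5), which is why the same 60-minute threshold sits two standard errors above the mean instead of one.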

Sampling Distribution and Inference

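The 95% confidence interval (i.e. the central 95% of the area underneath the graph) for the standard normal distribution is expressed algebraically:

$$P(-1.96 < Z < 1.96) = 0.95$$

With the definition of Z for the sampling distribution:

$$P\left(-1.96 < \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} < 1.96\right) = 0.95$$

Rearrangement yields:

$$P\left(\bar{X} - 1.96 \frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96 \frac{\sigma}{\sqrt{n}}\right) = 0.95$$

Or for the general case:

$$P\left(\bar{X} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$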


The smaller-than term (the lower bound of the interval) is referred to as the lower confidence limit (LCL) and the larger-than term (the upper bound) as the upper confidence limit (UCL).

Example

Suppose that the average assembly time across n = 25 computers is X-bar = 50 minutes. In addition, we assume that the population standard deviation is known and is equal to σ = 10 minutes. What is the 95% confidence interval?

Comment: α = 1 – CL. Here, α = 1 – 0.95 = 0.05 (or 5%). Thus, α/2 = 0.025.

Solution

LCL and UCL:

$$\bar{X} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 50 \pm 1.96 \cdot \frac{10}{\sqrt{25}} = 50 \pm 3.92$$

Thus, the LCL = 46.08 and the UCL = 53.92. The interpretation is straightforward: for n = 25 with σ = 10, we can be 95% confident that the true population mean μ lies between the LCL = 46.08 and the UCL = 53.92.
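The interval for this example (sample mean 50, σ = 10, n = 25) can be computed directly. This is a minimal sketch assuming a known population standard deviation; the function name `mean_confidence_interval` and its parameters are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Return (LCL, UCL) for the population mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}, about 1.96 for 95%
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

lcl, ucl = mean_confidence_interval(x_bar=50, sigma=10, n=25)
print(round(lcl, 2), round(ucl, 2))  # 46.08 53.92
```

The lower limit is always the smaller of the two values, since the margin z·σ/√n is subtracted for the LCL and added for the UCL.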

Finding zα/2

Z     0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07
1.0   0.3413  0.3438  0.3461  0.3485  0.3508  0.3531  0.3554  0.3577
1.1   0.3643  0.3665  0.3686  0.3708  0.3729  0.3749  0.3770  0.3790
1.2   0.3849  0.3869  0.3888  0.3907  0.3925  0.3944  0.3962  0.3980
1.3   0.4032  0.4049  0.4066  0.4082  0.4099  0.4115  0.4131  0.4147
1.4   0.4192  0.4207  0.4222  0.4236  0.4251  0.4265  0.4279  0.4292
1.5   0.4332  0.4345  0.4357  0.4370  0.4382  0.4394  0.4406  0.4418
1.6   0.4452  0.4463  0.4474  0.4484  0.4495  0.4505  0.4515  0.4525
1.7   0.4554  0.4564  0.4573  0.4582  0.4591  0.4599  0.4608  0.4616
1.8   0.4641  0.4649  0.4656  0.4664  0.4671  0.4678  0.4686  0.4693
1.9   0.4713  0.4719  0.4726  0.4732  0.4738  0.4744  0.4750  0.4756
2.0   0.4772  0.4778  0.4783  0.4788  0.4793  0.4798  0.4803  0.4808

Since CL = 0.95, α = 1 – 0.95 = 0.05. Then α / 2 = 0.025. For one half of the standard normal distribution table, this corresponds to 0.5 – 0.025 = 0.4750 = P(0 < Z < 1.96), found in the row 1.9 and the column 0.06. Thus, zα/2 = 1.96.

[Figure annotation: the tail area beyond zα/2 = 1.96 represents 2.5% of the area underneath the curve.]
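The table lookup can be reproduced with the inverse of the standard normal cumulative distribution function. The sketch below relies on `statistics.NormalDist.inv_cdf` from the Python standard library; the helper name `z_critical` is an illustrative assumption.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Return z_{alpha/2}, the value leaving alpha/2 in the upper tail."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

# Critical values for the three common confidence levels.
for cl in (0.90, 0.95, 0.99):
    print(cl, round(z_critical(cl), 3))

# Half-table entry for z = 1.96, i.e. P(0 < Z < 1.96).
print(round(NormalDist().cdf(1.96) - 0.5, 4))  # 0.475
```

Because the table above lists areas between 0 and z, 0.5 is subtracted from the cumulative probability to obtain the comparable figure.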

Normal Approximation of the Binomial Distribution

The binomial distribution may be approximated using the normal distribution. A graphical derivation of this is included in most statistics textbooks and is omitted here. The upside is that the normal approximation allows us to calculate confidence intervals for the binomial distribution. It can be shown that the sampling distribution of the sample proportion is described as follows:

$$E(\hat{P}) = p, \qquad \sigma_{\hat{P}} = \sqrt{\frac{p(1 - p)}{n}}$$

where p is the population proportion, n is the sample size, and p-hat is the proportion of successes in a Bernoulli trial process estimated from the statistical sample.

Confidence Interval Binomial Distribution

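Substituting E(P-hat) for μ and the standard error of the proportion for the standard error σ / √n in the formula for the confidence interval, with p-hat used in place of the unknown population proportion, yields:

$$\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$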

Example

In a survey including 1000 people, a political candidate received 52% of the votes cast. What is the 95% confidence interval associated with this result?

Solution

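LCL and UCL:

$$\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = 0.52 \pm 1.96 \sqrt{\frac{0.52 \cdot 0.48}{1000}} \approx 0.52 \pm 0.031$$

Thus, the LCL = 0.489 and the UCL = 0.551. Note that the LCL is below 0.5 (i.e. the interval includes 0.5, so from the sample there is no strong evidence at the 95% level that the candidate will win the election).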
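The arithmetic for the survey example can be checked with a few lines of code. This is a minimal sketch of the normal-approximation interval for a proportion; the function name `proportion_confidence_interval` and its parameters are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(p_hat, n, confidence=0.95):
    """Return (LCL, UCL) for a proportion using the normal approximation."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)        # z_{alpha/2}
    std_error = sqrt(p_hat * (1 - p_hat) / n)      # standard error of the proportion
    return p_hat - z * std_error, p_hat + z * std_error

lcl, ucl = proportion_confidence_interval(p_hat=0.52, n=1000)
print(round(lcl, 3), round(ucl, 3))  # 0.489 0.551
```

The standard error alone is about 0.016; multiplying by zα/2 = 1.96 gives a margin of about 0.031 on either side of 0.52.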
