SESSION 39 & 40
Last Update: 11th May 2011
Continuous Probability Distributions
Lecturer: Florian Boehlandt
University: University of Stellenbosch Business School
Domain: http://www.hedge-fund-analysis.net/pages/vega.php
Learning Objectives
1. Population and Samples
2. Point Estimates vs. Confidence Interval Estimates
3. Calculating Confidence Intervals
Normal Probabilities
Often it may be prohibitively expensive to obtain information on all members of a population. Thus, market researchers usually collect information from a sample, or subset, of the population. The sample statistics (e.g. the sample mean) are calculated and used to estimate the population parameters (e.g. the population mean). This process is known as statistical inference.
Notation
The notation for sample statistics and population parameters is given in the table below:
                        Size   Mean    Standard Deviation   Proportion
Population Parameters   N      μ       σ                    P
Sample Statistics       n      x-bar   s                    p-hat
Inference
[Diagram: a sample statistic yields either a point estimate (equal to the sample statistic) or a confidence interval estimate; both are used to infer the unknown population parameter.]
A point estimator draws inferences about the population by estimating the value of an unknown parameter using a single value or point.
An interval estimator draws inferences about the population by estimating the value of an unknown parameter using an interval.
Common confidence levels include:
- 90%: Weak statistical evidence
- 95%: Strong statistical evidence
- 99%: Overwhelming statistical evidence
Central Limit Theorem
The sampling distribution of the mean of a random sample drawn from any population is approximately normal for sufficiently large sample sizes. The larger the sample size, the more closely the sampling distribution of X-bar will resemble the normal distribution. This is an important notion since it allows for using the normal distribution to describe the dispersion of sample means. Example: tossing n dice and recording the average result.
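The dice example can be tried out with a small simulation. The sketch below is illustrative only: the function name `dice_mean_samples`, the number of trials and the seed are assumptions, not part of the lecture. It draws many averages of n fair six-sided dice and prints their mean and standard deviation, so the spread can be compared with σ / √n, where σ ≈ 1.708 for a single die.

```python
import random
import statistics


def dice_mean_samples(n_dice, n_trials=10_000, seed=0):
    """Return n_trials sample means, each the average of n_dice fair dice."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.randint(1, 6) for _ in range(n_dice))
        for _ in range(n_trials)
    ]


if __name__ == "__main__":
    sigma_single_die = (35 / 12) ** 0.5  # standard deviation of one fair die
    for n in (1, 4, 30):
        means = dice_mean_samples(n)
        print(
            f"n={n:2d}  mean of means={statistics.fmean(means):.3f}  "
            f"sd of means={statistics.stdev(means):.3f}  "
            f"sigma/sqrt(n)={sigma_single_die / n ** 0.5:.3f}"
        )
```

As n grows, the simulated standard deviation of the averages should shrink toward σ / √n, and a histogram of the averages (not drawn here) should look increasingly bell-shaped.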
Sampling Distribution
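It can be shown that the sampling distribution of the mean is described as follows:
$$\mu_{\bar{X}} = \mu, \qquad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
If X is normal, X-bar is normal. If X is nonnormal, X-bar is approximately normal for sufficiently large sample sizes. So for the sampling distribution, the standardization
$$Z = \frac{X - \mu}{\sigma}$$
changes to:
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$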
Example
Suppose that the amount of time to assemble a computer is normally distributed with a mean μ = 50 minutes and a standard deviation σ = 10 minutes.
a) What is the probability that one randomly selected computer is assembled in a time less than 60 minutes?
b) What is the probability that four randomly selected computers have a mean assembly time of less than 60 minutes?
Solution
a) $$Z = \frac{60 - 50}{10 / \sqrt{1}} = 1$$
b) $$Z = \frac{60 - 50}{10 / \sqrt{4}} = 2$$
The associated probabilities are P(Z < 1) = 0.8413 and P(Z < 2) = 0.9772 respectively.
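The two probabilities can be checked without a printed table. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; the helper name `prob_mean_below` is an assumption made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def prob_mean_below(x, mu, sigma, n=1):
    """P(X-bar < x) for the mean of n draws from a Normal(mu, sigma) population."""
    standard_error = sigma / sqrt(n)   # sigma / sqrt(n)
    z = (x - mu) / standard_error      # standardize the sample mean
    return NormalDist().cdf(z)         # standard normal cumulative probability


if __name__ == "__main__":
    print(round(prob_mean_below(60, 50, 10, n=1), 4))  # a) 0.8413
    print(round(prob_mean_below(60, 50, 10, n=4), 4))  # b) 0.9772
```

The only difference between the two calls is n: averaging four computers halves the standard error, which doubles the Z value.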
Sampling Distribution and Inference
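The 95% confidence interval (i.e. the central area underneath the graph) for the standard normal distribution is expressed algebraically:
$$P(-1.96 < Z < 1.96) = 0.95$$
With the definition of Z for the sampling distribution:
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
Rearrangement yields:
$$P\left(\bar{X} - 1.96 \frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96 \frac{\sigma}{\sqrt{n}}\right) = 0.95$$
Or for the general case:
$$P\left(\bar{X} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$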
The lower bound, X-bar − zα/2 · σ / √n, is referred to as the Lower Confidence Limit (LCL), and the upper bound, X-bar + zα/2 · σ / √n, as the Upper Confidence Limit (UCL).
Example
Suppose that the average assembly time across n = 25 computers is X-bar = 50 minutes. In addition, we assume that the population standard deviation is known and is equal to σ = 10 minutes. What is the 95% confidence interval?
Comment: α = 1 – CL. Here, α = 1 – 0.95 = 0.05 (or 5%). Thus, α/2 = 0.025.
Solution
$$\text{LCL and UCL} = \bar{X} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 50 \pm 1.96 \times \frac{10}{\sqrt{25}} = 50 \pm 3.92$$
Thus, the LCL = 46.08 and the UCL = 53.92. The interpretation is straightforward: for n = 25 with σ = 10, we can be 95% confident that the true population mean μ lies between the LCL = 46.08 and the UCL = 53.92.
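The interval for a mean with known σ can be computed directly. The following sketch assumes a hypothetical helper named `mean_confidence_interval`; it obtains zα/2 from the standard normal inverse CDF rather than from a table, so the multiplier is 1.95996 rather than the rounded 1.96.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Return (LCL, UCL) for the population mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}, about 1.96 for 95%
    margin = z * sigma / sqrt(n)             # z_{alpha/2} * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


if __name__ == "__main__":
    lcl, ucl = mean_confidence_interval(x_bar=50, sigma=10, n=25)
    print(round(lcl, 2), round(ucl, 2))  # 46.08 53.92
```

The lower limit is always the smaller of the two values, since the margin is subtracted from X-bar for the LCL and added for the UCL.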
Finding zα/2
Z     0       0.01    0.02    0.03    0.04    0.05    0.06    0.07
1.0   0.3413  0.3438  0.3461  0.3485  0.3508  0.3531  0.3554  0.3577
1.1   0.3643  0.3665  0.3686  0.3708  0.3729  0.3749  0.3770  0.3790
1.2   0.3849  0.3869  0.3888  0.3907  0.3925  0.3944  0.3962  0.3980
1.3   0.4032  0.4049  0.4066  0.4082  0.4099  0.4115  0.4131  0.4147
1.4   0.4192  0.4207  0.4222  0.4236  0.4251  0.4265  0.4279  0.4292
1.5   0.4332  0.4345  0.4357  0.4370  0.4382  0.4394  0.4406  0.4418
1.6   0.4452  0.4463  0.4474  0.4484  0.4495  0.4505  0.4515  0.4525
1.7   0.4554  0.4564  0.4573  0.4582  0.4591  0.4599  0.4608  0.4616
1.8   0.4641  0.4649  0.4656  0.4664  0.4671  0.4678  0.4686  0.4693
1.9   0.4713  0.4719  0.4726  0.4732  0.4738  0.4744  0.4750  0.4756
2.0   0.4772  0.4778  0.4783  0.4788  0.4793  0.4798  0.4803  0.4808
Since CL = 0.95, α = 1 – 0.95 = 0.05. Then α / 2 = 0.025. For one half of the standard normal distribution table, this corresponds to 0.5 – 0.025 = 0.4750 = P(0 < Z < 1.96). Thus, zα/2 = 1.96.
[Figure annotation: the tail beyond zα/2 = 1.96 represents 2.5% of the area underneath the curve.]
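The table lookup can be reproduced numerically. The sketch below is an illustration under assumed names (`z_alpha_half` is hypothetical): it inverts the standard normal CDF at 1 − α/2 to obtain zα/2, and also recomputes the half-table area P(0 < Z < 1.96) that appears in the 1.9 row under the 0.06 column.

```python
from statistics import NormalDist


def z_alpha_half(confidence):
    """Return z_{alpha/2}, the value that leaves alpha/2 in the upper tail."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


if __name__ == "__main__":
    for cl in (0.90, 0.95, 0.99):
        print(f"CL={cl:.2f}  z_alpha/2={z_alpha_half(cl):.3f}")

    # Area between 0 and 1.96, as tabulated in a half-table
    half_area = NormalDist().cdf(1.96) - 0.5
    print(round(half_area, 4))  # 0.475
```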
Normal Approximation of the Binomial Distribution
The binomial distribution may be approximated using the normal distribution. A graphical derivation of this is included in most statistics textbooks and is omitted here. The upside is that the normal approximation allows us to calculate confidence intervals for the binomial distribution. It can be shown that the sampling distribution of the sample proportion is described as follows:
$$E(\hat{P}) = p, \qquad \sigma_{\hat{P}} = \sqrt{\frac{p(1 - p)}{n}}$$
where p-hat is the proportion of successes in a Bernoulli trial process estimated from the statistical sample, and p denotes the population proportion (P in the notation table).
Confidence Interval Binomial Distribution
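Substituting E(P-hat) for μ and the standard error of the proportion for the standard error σ / √n in the formula for the confidence interval, with the unknown population proportion estimated by p-hat, yields:
$$\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$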
Example
In a survey including 1000 people, a political candidate received 52% of the votes cast. What is the 95% confidence interval associated with this result?
Solution
$$\text{LCL and UCL} = 0.52 \pm 1.96 \times \sqrt{\frac{0.52 \times 0.48}{1000}} = 0.52 \pm 0.031$$
Thus, the LCL = 0.489 and the UCL = 0.551. Note that the LCL is below 0.5 (i.e. at the 95% confidence level, the sample does not provide strong evidence that the candidate will win the election).
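The survey interval can be recomputed as a check on the arithmetic. This is a minimal sketch with a hypothetical helper name, `proportion_confidence_interval`. It applies p-hat ± zα/2 · √(p-hat(1 − p-hat)/n). For p-hat = 0.52 and n = 1000, the standard error is about 0.0158, and multiplying by zα/2 ≈ 1.96 gives a margin of about 0.031.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(p_hat, n, confidence=0.95):
    """Return (LCL, UCL) for a proportion using the normal approximation."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)          # z_{alpha/2}
    standard_error = sqrt(p_hat * (1 - p_hat) / n)   # standard error of p-hat
    margin = z * standard_error
    return p_hat - margin, p_hat + margin


if __name__ == "__main__":
    lcl, ucl = proportion_confidence_interval(p_hat=0.52, n=1000)
    print(round(lcl, 3), round(ucl, 3))  # 0.489 0.551
```

Whether the interval contains 0.5 is what determines how strongly the sample supports the claim that the candidate is ahead.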