SESSION 39 & 40
Last Update: 11th May 2011
Continuous Probability Distributions

Lecturer: Florian Boehlandt
University: University of Stellenbosch Business School
Domain: http://www.hedge-fund-analysis.net/pages/vega.php

Learning Objectives

1. Population and Samples
2. Point Estimates vs. Confidence Interval Estimates
3. Calculating Confidence Intervals

Normal Probabilities

Often it may be prohibitively expensive to obtain information on all members of a population. Thus, market researchers usually collect information from a sample or subset of the population. The sample statistics (e.g. the sample mean) are calculated and used to estimate the population parameters (e.g. the population mean). This process is known as statistical inference.

Notation

The notation for sample statistics and population parameters is given in the table below:

                        Size   Mean   Standard Deviation   Proportion
Population Parameters   N      μ      σ                    P
Sample Statistics       n      x̄      s                    p̂

Inference

Sample statistic → point estimate (= sample statistic) or confidence interval estimate → unknown population parameter

A point estimator draws inferences about the population by estimating the value of an unknown parameter using a single value or point.

An interval estimator draws inferences about the population by estimating the value of an unknown parameter using an interval.

Common confidence levels include:
- 90% Weak statistical evidence
- 95% Strong statistical evidence
- 99% Overwhelming statistical evidence

Central Limit Theorem

The sampling distribution of the mean of a random sample drawn from any population is approximately normal for sufficiently large sample sizes. The larger the sample size, the more closely the sampling distribution of X-bar will resemble the normal distribution. This is an important notion since it allows for using the normal distribution to describe the dispersion of sample means. Example: tossing n dice and recording the average result.
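The dice example can be checked exactly rather than by simulation. The sketch below is not part of the original slides: the function names and the choice of sample sizes are illustrative assumptions. It builds the exact distribution of the sum of n fair dice, then reports the mean and variance of the average together with the probability that the average lies within 1.96 standard errors of its mean. For a normal distribution that probability is 0.95, so values close to 0.95 indicate that the sampling distribution is approaching the normal shape.

```python
import math
from fractions import Fraction


def dice_sum_distribution(n):
    """Exact distribution of the sum of n fair six-sided dice."""
    dist = {0: Fraction(1)}
    for _ in range(n):
        nxt = {}
        for total, prob in dist.items():
            for face in range(1, 7):
                key = total + face
                nxt[key] = nxt.get(key, Fraction(0)) + prob * Fraction(1, 6)
        dist = nxt
    return dist


def average_summary(n):
    """Return (mean, variance, coverage) for the average of n dice.

    coverage is the exact probability that the average lies within
    1.96 standard errors of its mean (0.95 under a normal distribution).
    """
    dist = dice_sum_distribution(n)
    mean = sum(Fraction(total, n) * p for total, p in dist.items())
    var = sum((Fraction(total, n) - mean) ** 2 * p for total, p in dist.items())
    half_width = 1.96 * math.sqrt(float(var))
    coverage = sum(
        p for total, p in dist.items()
        if abs(total / n - float(mean)) <= half_width
    )
    return mean, var, float(coverage)


# Sample sizes chosen for illustration only.
for n in (1, 2, 5, 30):
    mean, var, coverage = average_summary(n)
    print(n, float(mean), round(float(var), 4), round(coverage, 4))
```

The mean of the average stays at 3.5 for every n, while its variance shrinks in proportion to 1/n.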

Sampling Distribution

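It can be shown that the sampling distribution of the sample mean is described as follows:

$$\mu_{\bar{X}} = \mu, \qquad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

If X is normal, X-bar is normal. If X is nonnormal, X-bar is approximately normal for sufficiently large sample sizes. So for the sampling distribution, the standardization

$$Z = \frac{X - \mu}{\sigma}$$

changes to:

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$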

Example

Suppose that the amount of time to assemble a computer is normally distributed with a mean μ = 50 minutes and a standard deviation σ = 10 minutes.

a) What is the probability that one randomly selected computer is assembled in a time less than 60 minutes?
b) What is the probability that four randomly selected computers have a mean assembly time of less than 60 minutes?

Solution

a) $$Z = \frac{60 - 50}{10 / \sqrt{1}} = 1$$

b) $$Z = \frac{60 - 50}{10 / \sqrt{4}} = 2$$

The associated probabilities are P(Z < 1) = 0.8413 and P(Z < 2) = 0.9772 respectively.
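These two probabilities can be reproduced with Python's standard library. The following is a minimal sketch, assuming the population values from the example (μ = 50, σ = 10); the helper name `prob_mean_below` is hypothetical and only serves the illustration.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 50.0, 10.0  # population mean and standard deviation in minutes


def prob_mean_below(limit, n):
    """P(X-bar < limit) for the mean of n independent assembly times."""
    standard_error = sigma / sqrt(n)  # sigma / sqrt(n)
    return NormalDist(mu, standard_error).cdf(limit)


print(round(prob_mean_below(60, 1), 4))  # 0.8413, one computer (Z = 1)
print(round(prob_mean_below(60, 4), 4))  # 0.9772, mean of four (Z = 2)
```

The same limit of 60 minutes is two standard errors above the mean once n = 4, which is why the probability rises.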

Sampling Distribution and Inference

For the standard normal distribution, 95% of the area under the curve lies between -1.96 and 1.96. Expressed algebraically:

$$P(-1.96 < Z < 1.96) = 0.95$$

With the definition of Z for the sampling distribution:

$$P\left(-1.96 < \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} < 1.96\right) = 0.95$$

Rearrangement yields:

$$P\left(\bar{X} - 1.96 \frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96 \frac{\sigma}{\sqrt{n}}\right) = 0.95$$

Or for the general case:

$$P\left(\bar{X} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$

The lower bound is referred to as the Lower Confidence Limit (LCL) and the upper bound as the Upper Confidence Limit (UCL).

Example

Suppose that the average assembly time across n = 25 computers is X-bar = 50 minutes. In addition, we assume that the population standard deviation is known and is equal to σ = 10 minutes. What is the 95% confidence interval?

Comment: α = 1 – CL. Here, α = 1 – 0.95 = 0.05 (or 5%). Thus, α/2 = 0.025.

Solution

LCL and UCL:

$$\bar{X} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 50 \pm 1.96 \cdot \frac{10}{\sqrt{25}} = 50 \pm 3.92$$

Thus, the LCL = 46.08 and the UCL = 53.92. The interpretation is straightforward: for n = 25 with σ = 10, we can be 95% confident that the true population mean μ lies between the LCL = 46.08 and the UCL = 53.92.
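The arithmetic can be verified with a few lines of Python. This is a sketch under the example's assumptions (known σ, z = 1.96 for a 95% confidence level); the function name `mean_confidence_interval` is hypothetical.

```python
from math import sqrt


def mean_confidence_interval(x_bar, sigma, n, z=1.96):
    """Return (LCL, UCL) for the population mean when sigma is known.

    z defaults to 1.96, the critical value for a 95% confidence level.
    """
    margin = z * sigma / sqrt(n)  # z * standard error of the mean
    return x_bar - margin, x_bar + margin


lcl, ucl = mean_confidence_interval(x_bar=50, sigma=10, n=25)
print(round(lcl, 2), round(ucl, 2))  # 46.08 53.92
```

The lower limit is always the smaller of the two values, since the margin z·σ/√n is subtracted from X-bar for the LCL and added for the UCL.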

Finding zα/2

Z     0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07
1.0   0.3413  0.3438  0.3461  0.3485  0.3508  0.3531  0.3554  0.3577
1.1   0.3643  0.3665  0.3686  0.3708  0.3729  0.3749  0.3770  0.3790
1.2   0.3849  0.3869  0.3888  0.3907  0.3925  0.3944  0.3962  0.3980
1.3   0.4032  0.4049  0.4066  0.4082  0.4099  0.4115  0.4131  0.4147
1.4   0.4192  0.4207  0.4222  0.4236  0.4251  0.4265  0.4279  0.4292
1.5   0.4332  0.4345  0.4357  0.4370  0.4382  0.4394  0.4406  0.4418
1.6   0.4452  0.4463  0.4474  0.4484  0.4495  0.4505  0.4515  0.4525
1.7   0.4554  0.4564  0.4573  0.4582  0.4591  0.4599  0.4608  0.4616
1.8   0.4641  0.4649  0.4656  0.4664  0.4671  0.4678  0.4686  0.4693
1.9   0.4713  0.4719  0.4726  0.4732  0.4738  0.4744  0.4750  0.4756
2.0   0.4772  0.4778  0.4783  0.4788  0.4793  0.4798  0.4803  0.4808

Since CL = 0.95, α = 1 – 0.95 = 0.05. Then α / 2 = 0.025. For one half of the standard normal distribution table, this corresponds to 0.5 – 0.025 = 0.4750 = P(0 < Z < 1.96). Thus, zα/2 = 1.96.

The tail beyond z = 1.96 represents 2.5% of the area under the curve.
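Instead of reading the table, the critical value can be obtained from the inverse of the standard normal cumulative distribution function. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `z_critical` is an illustrative assumption.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def z_critical(confidence_level):
    """Return z_(alpha/2) for a two-sided interval at the given confidence level."""
    alpha = 1 - confidence_level
    return standard_normal.inv_cdf(1 - alpha / 2)


for cl in (0.90, 0.95, 0.99):
    print(cl, round(z_critical(cl), 3))  # 1.645, 1.96, 2.576

# Half-table area between 0 and 1.96, matching the table entry 0.4750.
print(round(standard_normal.cdf(1.96) - 0.5, 4))
```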

Normal Approximation of the Binomial Distribution

The binomial distribution may be approximated using the normal distribution. A graphical derivation of this is included in most statistics textbooks and is omitted here. The upside is that the normal approximation allows us to calculate confidence intervals for the binomial distribution. It can be shown that the sampling distribution is described as follows:

$$E(\hat{P}) = p, \qquad \sigma_{\hat{P}} = \sqrt{\frac{p(1 - p)}{n}}$$

where p-hat is the proportion of successes in a Bernoulli trial process estimated from the statistical sample and p is the population proportion.
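The quality of the approximation can be illustrated by comparing an exact binomial probability with its normal counterpart. The values n = 100, p = 0.5 and k = 55 below are illustrative assumptions, not taken from the slides, and the continuity correction of 0.5 is a common refinement that the slides do not mention.

```python
from math import comb, sqrt
from statistics import NormalDist


def exact_prob_at_most(k, n, p):
    """Exact binomial probability P(X <= k) for n trials with success probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_prob_at_most(k, n, p):
    """Normal approximation of P(p-hat <= k/n), with a continuity correction."""
    standard_error = sqrt(p * (1 - p) / n)  # standard error of the proportion
    return NormalDist(p, standard_error).cdf((k + 0.5) / n)


n, p, k = 100, 0.5, 55  # illustrative values only
print(round(exact_prob_at_most(k, n, p), 4))
print(round(normal_prob_at_most(k, n, p), 4))
```

The two printed probabilities should agree closely, which is what justifies using normal critical values for proportions when n is large.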

Confidence Interval Binomial Distribution

Substituting E(P-hat) for μ and the standard error of the proportion for σ / √n in the formula for the confidence interval, with p-hat estimating the unknown p, yields:

$$\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

Example

In a survey including 1000 people, a political candidate received 52% of the votes cast. What is the 95% confidence interval associated with this result?

Solution

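LCL and UCL:

$$\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = 0.52 \pm 1.96 \sqrt{\frac{0.52 \cdot 0.48}{1000}} \approx 0.52 \pm 0.031$$

Thus, the LCL = 0.489 and the UCL = 0.551. Note that the LCL is below 0.5 (i.e. at the 95% confidence level, the sample alone does not provide strong evidence that the candidate will win the election).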
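The interval for the survey can be evaluated numerically. The sketch below computes p-hat ± z·√(p-hat(1 − p-hat)/n) directly for p-hat = 0.52 and n = 1000; the function name and the default z = 1.96 are illustrative choices rather than part of the original slides.

```python
from math import sqrt


def proportion_confidence_interval(p_hat, n, z=1.96):
    """Return (LCL, UCL) for a population proportion via the normal approximation."""
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    margin = z * standard_error
    return p_hat - margin, p_hat + margin


lcl, ucl = proportion_confidence_interval(p_hat=0.52, n=1000)
print(round(lcl, 3), round(ucl, 3))  # 0.489 0.551
```

With z = 1.96 the margin is about 0.031, so the interval extends below 0.5.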