Chapter 4 Modeling and Analysis of Uncertainty

Chapter 4 Modeling and Analysis of Uncertainty. Statistical Inference Statistical inference is using the information contained in a sample to make decisions


Page 1: Chapter 4 Modeling and Analysis of Uncertainty. Statistical Inference Statistical inference is using the information contained in a sample to make decisions

Chapter 4

Modeling and Analysis of Uncertainty


Statistical Inference

• Statistical inference is using the information contained in a sample to make decisions about the population. Statistical inference has two sub areas:

1. Parameter estimation – the estimating of a parameter of a population probability distribution. Point and Interval estimates.

2. Hypothesis testing – the testing of a hypothesis made regarding a parameter of a probability distribution.


Point Estimation

• If X is a random variable with a probability distribution f(x), the distribution has an unknown parameter θ, and $X_1, \ldots, X_n$ is a random sample of size n from f(x), then:

1. $\hat{\Theta} = h(X_1, X_2, \ldots, X_n)$ is called a point estimator of θ. Here h is a function of the observations in the sample (all random variables). Therefore $\hat{\Theta}$ is a random variable.

2. After the random sample has been taken, the random variables have numerical values and $\hat{\Theta}$ has a numerical value that is referred to as $\hat{\theta}$.

3. A point estimate of some population parameter θ is a single numerical value $\hat{\theta}$ of a statistic $\hat{\Theta}$.


Parameters and Point Estimates ($\hat{\theta}$)

Population Mean μ ------------ Sample mean $\bar{X}$

Population Variance σ² ---------- Sample variance S²

Difference in two means μ1 – μ2 ---- Difference in two sample means $\bar{X}_1 - \bar{X}_2$

All of these point estimates are statistics.


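Estimator for Percentiles

A. Let $x_p$ denote the p-quantile of X and $z_p$ denote the p-quantile of the standard normal. Assume X ~ N(μ, σ²). Then

$p = F_X(x_p) = P(X \le x_p) = \Phi\left(\frac{x_p - \mu}{\sigma}\right) \;\Rightarrow\; \frac{x_p - \mu}{\sigma} = z_p \;\Rightarrow\; x_p = \mu + \sigma z_p.$ (3)

Thus, we would estimate $x_p$ by

$\hat{x}_p = \hat{\mu} + \hat{\sigma} z_p = \bar{x} + s z_p.$ (4)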


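Estimators for Percentiles

B. When sampling from X ~ exp(λ),

$\mu_X = \frac{1}{\lambda}, \quad \sigma_X^2 = \frac{1}{\lambda^2}, \quad x_p = -\frac{1}{\lambda}\ln(1 - p) = -\mu_X \ln(1 - p).$

Thus,

$\hat{\mu} = \bar{x}, \quad \hat{\lambda} = \frac{1}{\bar{x}}, \quad \text{and} \quad \hat{x}_p = -\bar{x}\ln(1 - p).$

What if we wanted to estimate $\sigma_X^2$? We wouldn’t use s²; we would use

$\hat{\sigma}_X^2 = \bar{x}^2.$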


Estimators for Percentiles

Example 1: The data (ordered) are thicknesses of copper foil from a rolling operation (units in mils).

THICKNESS: 4.72 4.84 4.87 4.94 4.98 5.00 5.01 5.05 5.06 5.07 5.11 5.12 5.21 5.24 5.38

[Figure: Normal Probability Plot of THICKNESS (probability vs. thickness, 4.7 to 5.4)]

Average: 5.04
StDev: 0.165745
N: 15

Anderson-Darling Normality Test
A-Squared: 0.159
P-Value: 0.936


Estimators for Percentiles

The normal distribution provides a good fit to the data. Thus, assume the data come from a normal distribution, with

$\hat{\mu} = \bar{x} = 5.04, \quad \hat{\sigma} = s = 0.1657.$

The 90th percentile is estimated by

$\hat{x}_{0.90} = \bar{x} + s z_{0.90} = 5.04 + 0.1657(1.2816) = 5.25$

(To obtain standard normal quantiles, use normal tables or find them directly from MINITAB – we’ll demonstrate.)

The sample 90th percentile (see Chapter 2) is 5.30.
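The percentile estimate in Example 1 can be checked numerically. The following is a minimal sketch in Python (the slides themselves use MINITAB); the function name `normal_percentile_estimate` is a hypothetical helper, and the only inputs are the 15 thickness values listed in Example 1.

```python
from statistics import NormalDist, mean, stdev

# Copper-foil thicknesses (mils) from Example 1.
thickness = [4.72, 4.84, 4.87, 4.94, 4.98, 5.00, 5.01, 5.05,
             5.06, 5.07, 5.11, 5.12, 5.21, 5.24, 5.38]


def normal_percentile_estimate(data, p):
    """Estimate the p-quantile as x_bar + s * z_p, assuming normal data."""
    x_bar = mean(data)
    s = stdev(data)                # sample standard deviation (n - 1 denominator)
    z_p = NormalDist().inv_cdf(p)  # p-quantile of the standard normal
    return x_bar + s * z_p


x90_hat = normal_percentile_estimate(thickness, 0.90)
print(round(x90_hat, 2))
```

The result should agree with the hand calculation to two decimals; the estimate depends on the normality assumption, which is why it differs from the sample 90th percentile.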


Unbiased Estimators

• The point estimator $\hat{\Theta}$ is an unbiased estimator for the parameter θ if:

E($\hat{\Theta}$) = θ

• If the estimator is not unbiased, then the difference E($\hat{\Theta}$) − θ is called the bias of the estimator $\hat{\Theta}$. Bias measures how far, on average, the estimator falls from the parameter.

In words: “An estimator is unbiased when its expected value is equal to the parameter it estimates.”


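Point Estimation

[Figure: pdf of an unbiased estimator $\hat{\theta}_1$ and pdf of a biased estimator $\hat{\theta}_2$; the bias of $\hat{\theta}_1$ is 0 and the bias of $\hat{\theta}_2$ is B.]

“On the average,” when $\hat{\theta}_2$ is used, an error of size B is made.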


Example 2: Recall that for a random sample from a population with mean μ and variance σ²,

$E(\bar{X}) = \mu$ and $E(S^2) = \sigma^2.$

Thus, the sample mean (variance) is always unbiased for the population mean (variance).

But note – the sample standard deviation is never unbiased for σ:

$E(S) = E\left(\sqrt{S^2}\right) \ne \sqrt{E(S^2)} = \sigma.$

However, when sampling from the normal distribution, for sample sizes n ≥ 10 the bias is very small.
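To see how small the bias of S is under normal sampling, one can use the standard closed form E(S) = c4(n)·σ, with c4(n) = sqrt(2/(n − 1))·Γ(n/2)/Γ((n − 1)/2). This constant is not given in the slides; it is brought in here as an assumption-free textbook identity for normal samples, and the function name `c4` is just a label for this sketch.

```python
from math import gamma, sqrt


def c4(n):
    """Return E(S)/sigma for a random sample of size n from a normal distribution."""
    return sqrt(2.0 / (n - 1)) * gamma(n / 2) / gamma((n - 1) / 2)


# Relative bias of S as an estimator of sigma: E(S)/sigma - 1 = c4(n) - 1.
for n in (2, 5, 10, 30):
    print(n, round(c4(n), 4), "relative bias:", round(c4(n) - 1, 4))
```

The factor is always below 1, so S underestimates σ on average, and by n = 10 the shortfall is under 3%.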


Remark: In the expression for sample variance,

$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1},$

why not use n in the denominator? Because by using n – 1, we get an unbiased estimator for the population variance.

Whether or not an estimator is unbiased, the smaller the variance the better.


Estimator Variance

• When there are two unbiased estimators for the same parameter, a decision as to which is better can be made by examining the variance of each.

• The estimator with the least variance is more likely to result in an estimate that is closer to the population parameter.

• The unbiased estimator with the least variance is referred to as the minimum variance unbiased estimator (MVUE).


Mean Square Error

• A measure that incorporates both the bias and the variance of an estimator is the mean square error.

• The mean square error of an estimator $\hat{\Theta}$ of the parameter θ is defined as:

MSE($\hat{\Theta}$) = E($\hat{\Theta}$ − θ)²

• The mean square error can be rewritten as:

MSE($\hat{\Theta}$) = V($\hat{\Theta}$) + (bias)²
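The decomposition of the mean square error into variance plus squared bias follows from adding and subtracting the expected value of the estimator. A short derivation, written with $\hat{\Theta}$ for the estimator (notation assumed here, since the symbols were lost from the slide):

```latex
\begin{aligned}
\mathrm{MSE}(\hat{\Theta})
  &= E\big[(\hat{\Theta} - \theta)^2\big] \\
  &= E\big[\big((\hat{\Theta} - E\hat{\Theta}) + (E\hat{\Theta} - \theta)\big)^2\big] \\
  &= E\big[(\hat{\Theta} - E\hat{\Theta})^2\big]
     + 2\,(E\hat{\Theta} - \theta)\,E\big[\hat{\Theta} - E\hat{\Theta}\big]
     + (E\hat{\Theta} - \theta)^2 \\
  &= V(\hat{\Theta}) + (\text{bias})^2 ,
\end{aligned}
```

The cross term vanishes because $E[\hat{\Theta} - E\hat{\Theta}] = 0$.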


Relative Efficiency

• The mean square error can be used to compare the ability of two estimators to estimate a parameter.

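• For two estimators $\hat{\Theta}_1$, $\hat{\Theta}_2$, the relative efficiency of $\hat{\Theta}_2$ to $\hat{\Theta}_1$ is:

$\frac{\mathrm{MSE}(\hat{\Theta}_1)}{\mathrm{MSE}(\hat{\Theta}_2)}$

A ratio less than 1 indicates that $\hat{\Theta}_2$ is not as efficient as $\hat{\Theta}_1$ in estimating the parameter value θ.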


1. The MSE of an unbiased estimator is simply Var($\hat{\theta}$).

2. An estimator with small variance and small bias may be more efficient than one with a larger variance and no bias, as depicted below.

[Figure: pdf of an unbiased estimator $\hat{\theta}_1$ and pdf of a biased estimator $\hat{\theta}_2$.]

Which estimator is better? Why?
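A concrete case of a biased estimator beating an unbiased one on MSE is the variance estimator with n in the denominator. The sketch below assumes normal sampling, for which V(S²) = 2σ⁴/(n − 1); that formula and the function names are not from the slides and are introduced only for illustration.

```python
def mse_unbiased_variance(n, sigma2=1.0):
    """MSE of S^2 (denominator n - 1) under normal sampling: variance only, no bias."""
    return 2.0 * sigma2 ** 2 / (n - 1)


def mse_biased_variance(n, sigma2=1.0):
    """MSE of the denominator-n estimator (n - 1) S^2 / n under normal sampling."""
    variance = 2.0 * (n - 1) * sigma2 ** 2 / n ** 2
    bias = -sigma2 / n
    return variance + bias ** 2


for n in (5, 10, 30):
    ratio = mse_unbiased_variance(n) / mse_biased_variance(n)
    print(n, round(ratio, 4))
```

The ratio MSE(S²)/MSE(denominator-n estimator) exceeds 1 for every n, so under this assumption the slightly biased estimator has the smaller mean square error, although the gap shrinks as n grows.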


Standard Error

• The standard error of a statistic is the standard deviation of its sampling distribution.

• If the expression for the standard error contains its own parameters with unknown values, but which have estimators, use the estimators and the result is the estimated standard error of a statistic.

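In symbols, the standard error of an estimator $\hat{\Theta}$ is $\sigma_{\hat{\Theta}} = \sqrt{V(\hat{\Theta})}$.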


Hypothesis Testing

We like to think of statistical hypothesis testing as the data analysis stage of a comparative experiment, in which the engineer is interested, for example, in comparing the mean of a population to a specified value (e.g. mean pull strength).

Statistical Hypotheses


Hypothesis Testing

• A statistical hypothesis is a statement about the parameters of one or more populations. Hypotheses are expressed as, for example:

H0: μ = 50 cm/s (the null hypothesis)

H1: μ ≠ 50 cm/s (the alternative hypothesis)

• Generally the interest is in whether or not a parameter has changed, is in accordance with theory, or is in conformance with a requirement.


Hypothesis Testing cont.

• Testing the hypothesis involves taking a random sample, computing a test statistic, and using the test statistic to make a decision about the null hypothesis.

• The value of the test statistic is compared to the critical region, and if it is in the critical region the null hypothesis is rejected.


Type I Error

• The limits of the critical region are referred to as the critical values.

• These critical values depend on the desired probability for rejecting the null hypothesis when in fact it is true.

• Rejecting the null hypothesis H0 when it is true is defined as a type I error.

• The probability of a type I error is denoted as α: α = P(reject H0 when H0 is true).

• The type I error probability is also referred to as the significance level or size of the test.

• Generally the α value is decided prior to conducting the test.


Type II Error

• Failing to reject the null hypothesis when it is false is referred to as a Type II error.

• The probability of making a type II error is denoted as β: β = P(fail to reject H0 when H0 is false).

• This probability is a function of two quantities: first, how close the hypothesized value of the parameter is to the actual value of the parameter, and second, the sample size.

• The closer the two values of the parameter and the smaller the sample size, the larger the value of β.

• To calculate β, a true population parameter value is required.


Power of a test

• The power of a statistical test is the probability of rejecting the null hypothesis when it is false.

• This probability is 1-β.

• If the power value is deemed too low, increase α or the sample size.
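The effect of α and the sample size on power can be illustrated for a two-sided z-test on a mean with known σ. This is a sketch only: the function name is hypothetical, and the value σ = 2.5 used in the calls is an assumed number chosen for illustration, not one stated in the slides.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sided_z(mu0, mu_true, sigma, n, alpha):
    """Power of the two-sided z-test of H0: mu = mu0 when the true mean is mu_true."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Under the true mean, the test statistic is normal with this mean and variance 1.
    shift = (mu_true - mu0) * sqrt(n) / sigma
    beta = z.cdf(z_crit - shift) - z.cdf(-z_crit - shift)  # P(fail to reject H0)
    return 1 - beta


# sigma = 2.5 is an assumed value for illustration only.
base = power_two_sided_z(50, 52, 2.5, n=10, alpha=0.05)
more_data = power_two_sided_z(50, 52, 2.5, n=16, alpha=0.05)
larger_alpha = power_two_sided_z(50, 52, 2.5, n=10, alpha=0.10)
print(base < more_data, base < larger_alpha)
```

When the true mean equals the hypothesized mean, the function returns α, which matches the definition of the type I error probability.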


Hypothesis Testing

For example, suppose that we are interested in the burning rate of a solid propellant used to power aircrew escape systems.

• Now burning rate is a random variable that can be described by a probability distribution.

• Suppose that our interest focuses on the mean burning rate (a parameter of this distribution).

• Specifically, we are interested in deciding whether or not the mean burning rate is 50 centimeters per second.

Statistical Hypotheses


Hypothesis Testing

4-3.1 Statistical Hypotheses

Two-sided Alternative Hypothesis

One-sided Alternative Hypotheses


Hypothesis Testing

4-3.1 Statistical Hypotheses

Test of a Hypothesis

• A procedure leading to a decision about a particular hypothesis.

• Hypothesis-testing procedures rely on using the information in a random sample from the population of interest.

• If this information is consistent with the hypothesis, then we fail to reject the hypothesis; if this information is inconsistent with the hypothesis, we conclude that the hypothesis is false.


Hypothesis Testing

4-3.2 Testing Statistical Hypotheses


Sometimes the type I error probability is called the significance level, or the α-error, or the size of the test.


• The power is computed as 1 − β, and power can be interpreted as the probability of correctly rejecting a false null hypothesis. We often compare statistical tests by comparing their power properties.

• For example, consider the propellant burning rate problem when we are testing H0: μ = 50 centimeters per second against H1: μ ≠ 50 centimeters per second. Suppose that the true value of the mean is μ = 52. When n = 10, we found that β = 0.2643, so the power of this test is 1 − β = 1 − 0.2643 = 0.7357 when μ = 52.


Hypothesis Testing

4-3.3 One-Sided and Two-Sided Hypotheses

Two-Sided Test:

One-Sided Tests:


Hypothesis Testing

4-3.4 General Procedure for Hypothesis Testing

4 and 5 determine 6


Case 1: Inference on the Mean of a Population, Variance Known

Assumptions


4-4 Inference on the Mean of a Population, Variance Known

4-4.1 Hypothesis Testing on the Mean

We wish to test:

H0: μ = μ0
H1: μ ≠ μ0

The test statistic is:

$Z_0 = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$

This statistic follows the standard normal distribution when H0 is true.


4-4 Inference on the Mean of a Population, Variance Known

4-4.1 Hypothesis Testing on the Mean

Reject H0 if the observed value of the test statistic z0 is either:

$z_0 > z_{\alpha/2}$ or $z_0 < -z_{\alpha/2}$

Fail to reject H0 if

$-z_{\alpha/2} < z_0 < z_{\alpha/2}$

The values $\pm z_{\alpha/2}$ are the critical values that define the boundaries of the critical (rejection) region of the test statistic.


Inference on the Mean of a Population, Variance Known

4-4.2 P-Values in Hypothesis Testing

Alternatively, the P-value is the probability that the test statistic will take on a value at least as extreme as the calculated value z0 when the null hypothesis is true.


Inference on the Mean of a Population, Variance Known

4-4.3 Type II Error and Choice of Sample Size

Finding The Probability of Type II Error


Sample Size Formulas


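Example 3: A. Let $X_1, X_2, \ldots, X_n$ be a random sample from $N(\mu, \sigma^2)$. Since $\bar{X} \sim N(\mu, \sigma^2/n)$, the standard error of $\bar{X}$ as an estimator of μ is $\sigma_{\bar{X}} = \sigma/\sqrt{n}$. In most cases σ will be unknown. The estimated standard error is

$\hat{\sigma}_{\bar{X}} = \frac{\hat{\sigma}}{\sqrt{n}} = \frac{s}{\sqrt{n}}.$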

B. For the data of Example 1 (thickness of copper foil),

$\hat{\mu} = \bar{x} = 5.04, \quad \hat{\sigma} = s = 0.1657.$

Thus,

$\hat{\sigma}_{\bar{X}} = \frac{s}{\sqrt{n}} = \frac{0.1657}{\sqrt{15}} = 0.0428.$

Recall that MINITAB displays this:

Descriptive Statistics

Variable    N   Mean    Median  TrMean  StDev   SE Mean
THICKNESS   15  5.0400  5.0500  5.0385  0.1657  0.0428

Variable    Minimum  Maximum  Q1      Q3
THICKNESS   4.7200   5.3800   4.9400  5.1200
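The estimated standard error reported as "SE Mean" can be reproduced from the raw thickness data of Example 1. A minimal Python sketch (the helper name `estimated_standard_error` is hypothetical):

```python
from math import sqrt
from statistics import stdev

# Copper-foil thicknesses (mils) from Example 1.
thickness = [4.72, 4.84, 4.87, 4.94, 4.98, 5.00, 5.01, 5.05,
             5.06, 5.07, 5.11, 5.12, 5.21, 5.24, 5.38]


def estimated_standard_error(data):
    """Estimated standard error of the sample mean: s / sqrt(n)."""
    return stdev(data) / sqrt(len(data))


se_mean = estimated_standard_error(thickness)
print(round(se_mean, 4))
```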


Example 4: The mean force exerted at a particular point in a newly designed mechanical assembly should be less than 4000 psi. Based on previous designs, the engineer “knows” that the force is normally distributed with standard deviation 65 psi. He will obtain data on 20 prototype assemblies to test

H0: μ = 4000
H1: μ < 4000

at the 0.01 level of significance.

A. He obtained $\bar{x} = 3982.8$.


Thus,

$z_0 = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{3982.8 - 4000}{65/\sqrt{20}} = -1.18$

$z_{0.01} = 2.326$

Therefore $z_0 > -z_{0.01}$, so $z_0$ is not in the critical region.

Do not reject H0.

The engineer cannot (as he “hoped” to) conclude that the mean force is less than 4000 psi.
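The arithmetic of Example 4 can be reproduced from the summary values alone (sample mean 3982.8, σ = 65, n = 20, α = 0.01). This is a sketch with a hypothetical function name, not the MINITAB procedure used in the slides:

```python
from math import sqrt
from statistics import NormalDist


def lower_tail_z_test(x_bar, mu0, sigma, n, alpha):
    """One-sided z-test of H0: mu = mu0 against H1: mu < mu0 with known sigma."""
    z0 = (x_bar - mu0) / (sigma / sqrt(n))
    z_crit = -NormalDist().inv_cdf(1 - alpha)  # reject when z0 < z_crit
    p_value = NormalDist().cdf(z0)             # lower-tail probability of z0
    return z0, z_crit, p_value, z0 < z_crit


z0, z_crit, p_value, reject = lower_tail_z_test(3982.8, 4000, 65, 20, 0.01)
print(round(z0, 2), round(z_crit, 3), round(p_value, 2), reject)
```

The test statistic lies above the critical value, so H0 is not rejected at the 0.01 level.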

B. Do the problem on MINITAB, but note that you must have raw, not summarized, data.

The engineer collected the data on the next slide.


Stat > Basic Statistics > 1-Sample Z

MINITAB Output:

Test of mu = 4000.0 vs mu < 4000.0
The assumed sigma = 65.0

Variable  N   Mean    StDev  SE Mean  Z      P
FORCE     20  3982.8  60.5   14.5     -1.18  0.12

P-value > 0.01. Do not reject H0.


4-4 Inference on the Mean of a Population, Variance Known

4-4.4 Large-Sample Test

In general, if n ≥ 30, the sample variance s² will be close to σ² for most samples, and so s can be substituted for σ in the test procedures with little harmful effect. That is, for large sample sizes:

$\frac{\bar{X} - \mu}{S/\sqrt{n}} \overset{\text{approx}}{\sim} N(0, 1)$


The results we have developed for the case of known variance are used for large sample size tests with unknown variance, except for two important differences.

1. Wherever you see σ, substitute the sample standard deviation s.

2. The confidence intervals and hypothesis tests become approximate rather than exact.

In addition, if the underlying distribution is not too non-normal, the approximation should be adequate if n ≥ 30.

We’ll do an example.


Example 5: A civil engineer working for NYSDOT tests 50 standard-sized specimens of concrete for moisture content after 5 hours of drying time. She collects the following data (appropriate units). She also constructs the histogram shown below.

11.7 15.3 29.1 13.1 39.1  9.4 14.9 11.2 11.2 25.0
26.4 31.0 20.6 12.4 19.6 16.9  5.2 15.8 17.8 21.1
19.6 41.7 39.3 22.7 33.1 23.4 23.0 19.3 19.3 17.2
 5.1 59.4  5.7 12.1 27.7  6.5 16.9 21.9 32.0 20.9
18.6 10.4  5.7 14.2 11.7 15.3 15.2 19.5 11.4 13.6

$\bar{x} = 19.38, \quad s = 10.51$

[Figure: histogram of MOISTURE (frequency vs. moisture content, 0 to 60)]


The histogram indicates that the distribution is quite skewed. She also does a normality test: in MINITAB do

Stat > Basic Statistics > Normality Test

[Figure: Normal Probability Plot of MOISTURE (probability vs. moisture content, 10 to 60)]

Average: 19.384
StDev: 10.5096
N: 50

Anderson-Darling Normality Test
A-Squared: 1.264
P-Value: 0.002 (very low P-value)


She just wants a 95% confidence interval on the mean moisture content. Since the sample size is fairly large, an approximate 95% confidence interval is:

$\bar{x} \pm z_{0.025}\frac{s}{\sqrt{n}} = 19.38 \pm 1.96\,\frac{10.51}{\sqrt{50}} = 19.38 \pm 2.91 = (16.47,\ 22.29)$

You can do this on MINITAB by “making believe” that σ is known and equal to s.
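The large-sample interval of Example 5 can also be computed directly from the summary statistics (mean 19.38, s = 10.51, n = 50). A minimal Python sketch, with a hypothetical function name; it treats s as if it were σ, which is exactly the large-sample approximation described above:

```python
from math import sqrt
from statistics import NormalDist


def large_sample_ci(x_bar, s, n, conf=0.95):
    """Approximate two-sided confidence interval for the mean, using s in place of sigma."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z_{alpha/2}
    half_width = z * s / sqrt(n)
    return x_bar - half_width, x_bar + half_width


lower, upper = large_sample_ci(19.38, 10.51, 50)
print(round(lower, 2), round(upper, 2))
```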


4-4 Inference on the Mean of a Population, Variance Known

4-4.5 Some Practical Comments on Hypothesis Testing

Statistical versus Practical Significance


Interval Estimation

The standard error provides a measure of precision of an estimate. In fact, very often in the engineering literature a point on a graph will have so-called “error bars” around it. The plotted point is actually the mean of a sample and the error bar is $\bar{x} \pm \hat{\sigma}_{\bar{X}}$. This is actually a special case of a confidence interval, formally defined on the next slide.


4-4 Inference on the Mean of a Population, Variance Known

4-4.6 Confidence Interval on the Mean

Two-sided confidence interval:

One-sided confidence intervals:

Confidence coefficient:


4-4 Inference on the Mean of a Population, Variance Known

4-4.6 Confidence Interval on the Mean

α       z_α
0.005   2.576
0.01    2.326
0.025   1.96
0.05    1.645


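Example 6: A. Let $X_1, X_2, \ldots, X_n$ be a random sample from $N(\mu, \sigma^2)$, and assume that σ is known. Since $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$,

$1 - \alpha = P\left(-z_{\alpha/2} \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right)$ [by (10)]

$\quad = P\left(-z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \bar{X} - \mu \le z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$

$\quad = P\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$ (11)

where the lower endpoint is L and the upper endpoint is U.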


Comparison of (9) and (11) implies that a 100(1 – α)% confidence interval for μ is

$\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}.$ (12)

B. The output temperature of a gas turbine is $N(625\ ^\circ\text{C},\ 22^2)$. An engineer has “tweaked” the design in order to increase it by at least 5%. The standard deviation will not be affected. A test of 16 turbines gives a sample mean of 678º. By (12), a 95% confidence interval for the (population) mean is

$678 \pm 1.96\,\frac{22}{\sqrt{16}} = 678 \pm 10.78 = (667.2,\ 688.8).$


4-4 Inference on the Mean of a Population, Variance Known

4-4.6 Confidence Interval on the Mean

Relationship between Tests of Hypotheses and Confidence Intervals

If [l, u] is a 100(1 − α) percent confidence interval for the parameter θ, then the test of significance level α of the hypothesis

H0: θ = θ0
H1: θ ≠ θ0

will lead to rejection of H0 if and only if the hypothesized value θ0 is not in the 100(1 − α) percent confidence interval [l, u].


4-4 Inference on the Mean of a Population, Variance Known

4-4.6 Confidence Interval on the Mean

Confidence Level and Precision of Estimation

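The length of the two-sided 95% confidence interval is

$2\left(1.96\,\sigma/\sqrt{n}\right) = 3.92\,\sigma/\sqrt{n}$

whereas the length of the two-sided 99% confidence interval is

$2\left(2.576\,\sigma/\sqrt{n}\right) = 5.152\,\sigma/\sqrt{n}$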


4-4 Inference on the Mean of a Population, Variance Known

4-4.6 Confidence Interval on the Mean

Choice of Sample Size


Example 7: The tensile strength of a fiber used in the manufacture of insulation material is normally distributed with a standard deviation of 0.04 gm. Because some process adjustments have been made, however, the mean has not been established. In a test of 40 fibers, the data shown on the next slide were collected. Find a 99% confidence interval for the population mean.


Tensile Strength (gms)

 9.99 10.05 10.02  9.99 10.05  9.96 10.00  9.99 10.03 10.05
10.02  9.98  9.96 10.01 10.00  9.90 10.04  9.99  9.93  9.99
 9.97  9.96 10.00 10.07  9.95  9.96  9.97  9.97 10.01  9.97
 9.99 10.01 10.01 10.01  9.95  9.99  9.97  9.95  9.96 10.01

$\bar{x} = 9.9908$

The 99% confidence interval is:

$\bar{x} \pm z_{0.005}\frac{\sigma}{\sqrt{n}} = 9.9908 \pm (2.5758)\frac{0.04}{\sqrt{40}} = 9.9908 \pm (2.5758)(0.00632) = 9.9908 \pm 0.0163 = (9.974,\ 10.007)$

Let’s see how to do it using MINITAB.

Do Basic Statistics > 1-Sample Z


The result of Example 7 is: “With 99% confidence, the mean tensile strength is between 9.974 and 10.007 grams.” However, for a variable like tensile strength, an engineer would perhaps like to say: “With 99% confidence, the mean tensile strength is at least (whatever).”

This brings us to the notion of one-sided confidence intervals, defined next.

Definition: Suppose $\theta$ is an unknown parameter.

(i) Suppose L is a statistic based on a sample $X_1, X_2, \ldots, X_n$. If

$$P(L \le \theta) = 1 - \alpha, \qquad (23a)$$

then the observed value of L, l say, is called a $100(1-\alpha)\%$ one-sided lower confidence limit or bound for $\theta$.

(ii) Suppose U is a statistic based on a sample $X_1, X_2, \ldots, X_n$. If

$$P(\theta \le U) = 1 - \alpha, \qquad (23b)$$

then the observed value of U, u say, is called a $100(1-\alpha)\%$ one-sided upper confidence limit for $\theta$.

We said we would define one-sided confidence intervals, but in the definition we refer to one-sided limits or bounds. In fact, we really have defined intervals: corresponding to the limits l and u in (23a,b) are the intervals $(l, \infty)$ and $(-\infty, u)$.

For the present case of sampling from $N(\mu, \sigma^2)$ with $\sigma^2$ known, one-sided lower and upper $100(1-\alpha)\%$ confidence limits are respectively

$$\bar{x} - z_{\alpha}\frac{\sigma}{\sqrt{n}} \quad \text{and} \quad \bar{x} + z_{\alpha}\frac{\sigma}{\sqrt{n}}. \qquad (24)$$

Example 8: In Example 7, it would probably have been more appropriate to report a one-sided confidence limit.

Suppose there is a customer-mandated spec that specifies that the mean strength be at least 10. The lower 99% one-sided confidence limit for the mean tensile strength of this fiber is

$$\bar{x} - z_{0.01}\frac{\sigma}{\sqrt{n}} = 9.9908 - z_{0.01}(0.00632) = 9.9908 - 2.3263(0.00632) = 9.976$$
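The one-sided limits in (24) are easy to compute directly. The sketch below uses only the Python standard library; `lower_bound` and `upper_bound` are hypothetical helper names, and the inputs are the values from Examples 7 and 8.

```python
from math import sqrt
from statistics import NormalDist

def lower_bound(xbar, sigma, n, conf):
    """One-sided lower confidence limit: xbar - z_alpha * sigma / sqrt(n)."""
    z_alpha = NormalDist().inv_cdf(conf)  # conf = 1 - alpha
    return xbar - z_alpha * sigma / sqrt(n)

def upper_bound(xbar, sigma, n, conf):
    """One-sided upper confidence limit: xbar + z_alpha * sigma / sqrt(n)."""
    z_alpha = NormalDist().inv_cdf(conf)
    return xbar + z_alpha * sigma / sqrt(n)

# Example 8: xbar = 9.9908, sigma = 0.04, n = 40, 99% confidence.
limit = lower_bound(9.9908, 0.04, 40, 0.99)
print(f"{limit:.3f}")
```

Note that the one-sided limit uses $z_{\alpha}$ rather than $z_{\alpha/2}$, which is why the lower limit here is slightly higher than the lower end of the two-sided interval in Example 7.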

Remark: MINITAB Student doesn’t do one-sided limits directly. Can you figure out how to do it indirectly?

4-4 Inference on the Mean of a Population, Variance Known

4-4.6 Confidence Interval on the Mean

One-Sided Confidence Bounds

Problem 4-27

• In the production of airbag inflators for automotive safety systems, a company is interested in ensuring that the mean distance of the foil to the edge of the inflator is at least 2 cm. Measurements on 20 inflators yielded an average value of 2.02 cm. Assume a standard deviation of .05 on the distance measurements and a significance level of .01.

a) Test for conformance to the company’s requirement.

Airbag Inflator and Assembly

Problem 4-27

b) What is the P-value of this test?

c) What is the β of this test if the true mean is 1.97 cm?

d) What sample size would be necessary to detect a true mean of 1.97 with probability of at least .90?

e) Find a 99% lower confidence bound on the true mean.

f) Use the CI found in part e to test the hypothesis in part a.

g) What sample size is required to be 95% confident that the error in estimating the mean is less than .01?

4.4 Case II: Inference on the Mean of a Population, Variance Unknown

Let $X_1, X_2, \ldots, X_n$ be a random sample from $N(\mu, \sigma^2)$, with $\sigma^2$ unknown.

This case differs from Case I in that now we shall consider the more realistic situation in which the variance is unknown. The primary reason for considering Case I at all is that it is a convenient way of illustrating the basic concepts of confidence interval estimation and hypothesis testing.

When sampling from the normal distribution, the exact confidence interval for $\mu$ is based solely on the fact that

$$\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1) \qquad (27a)$$

and, for the exact hypothesis test, under $H_0$,

$$\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \sim N(0, 1). \qquad (27b)$$

Although (27a,b) are true whether or not the variance is known, the problem is that if $\sigma$ is not known, this knowledge does us no good with respect to confidence intervals and hypothesis tests for the mean.

Intuitively, what you don’t know, you estimate. In particular, suppose in (27a) we estimate the population standard deviation $\sigma$ by the sample standard deviation $S$. This leads to the question

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim \ ? \qquad (28)$$

The pdf of the random variable T defined in (28) has been derived [see eqn. (4-40), p. 163 of the text]. The distribution is called (Student’s) t distribution. The parameter of a t distribution is its degrees of freedom, say k.

In particular, the random variable defined by (28) has n – 1 degrees of freedom (write $T \sim t_{n-1}$). Thus, for each sample size, T has a different pdf.

Intuitively, it would seem that the pdfs of the t distributions should look like standard normal pdfs. They do, but there is a difference. Some representative t pdfs are shown on the next slide.

Pdfs of several t distributions, where k denotes the degrees of freedom. They look much like standard normal pdfs, but the tails are higher (more variance). Note that the t pdf approaches that of the standard normal as $k \to \infty$.

Higher tails than pdf of standard normal
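The two claims about the t pdfs (higher tails than the standard normal, and convergence to the standard normal as k grows) can be checked numerically. The sketch below evaluates the standard closed form of the t density, $f(x) = \frac{\Gamma((k+1)/2)}{\sqrt{k\pi}\,\Gamma(k/2)}\left(1 + x^2/k\right)^{-(k+1)/2}$, using only the Python standard library; the function name `t_pdf` is a hypothetical helper.

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist

def t_pdf(x, k):
    """Density of Student's t distribution with k degrees of freedom.

    Works on the log scale (lgamma) so that large k does not overflow.
    """
    log_const = lgamma((k + 1) / 2) - lgamma(k / 2) - 0.5 * log(k * pi)
    return exp(log_const - (k + 1) / 2 * log(1 + x * x / k))

normal_pdf = NormalDist().pdf  # standard normal density

# Compare the density in the tail (x = 3) and at the center (x = 0).
for k in (1, 5, 30, 1000):
    print(f"k = {k:4d}: f(3) = {t_pdf(3.0, k):.5f}, f(0) = {t_pdf(0.0, k):.5f}")
print(f"normal  : f(3) = {normal_pdf(3.0):.5f}, f(0) = {normal_pdf(0.0):.5f}")
```

For small k the tail value at x = 3 is several times larger than the normal value, and both columns move toward the normal values as k increases.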

Percentage points of t distributions are defined similarly to those of the standard normal as shown in the figure below.

Percentage points of the t distribution, where k denotes the degrees of freedom. To obtain percentage points, use MINITAB or Table II in Appendix A. We’ll demonstrate.

4-5 Inference on the Mean of a Population, Variance Unknown

4-5.1 Hypothesis Testing on the Mean

Example 4-7

• A sample of size 15 was taken on the coefficient of restitution of golf club drivers. The variance of the coefficient is unknown.

• Does the mean coefficient of restitution exceed .82?

4-5 Inference on the Mean of a Population, Variance Unknown

4-5.2 P-value for a t-Test

β = P(Type II Error)

We need to know the distribution of the test statistic

$$T_0 = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$$

under $H_1$.

The answer is: $T_0$ has a so-called noncentral t distribution. The probability of making a type II error, and the corresponding power of the test, can be illustrated for hypothesis tests with unknown variance in the same way as was done for tests with known variance. However, the numerical computation is cumbersome, and we will find β values and the power of a test using Minitab only.

β = P(Type II Error)

Example 4-8

If the mean differs from .82 by as much as .02, is the sample size of 15 adequate to ensure that the null hypothesis will be rejected with probability of at least .80?

• Stat > Power and Sample Size > 1-Sample t

Calculate power for each sample size

Specify a sample size, difference, and sigma; click on Options and enter the type of alternative hypothesis and the level of significance.
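To get a feel for what such a power calculation does, the power of the upper-tailed one-sample t-test can be approximated by simulation. The sketch below is not the noncentral t calculation that Minitab performs; it swaps in a plain Monte Carlo estimate that needs only the Python standard library. The function names are hypothetical helpers, and the values sigma = 0.025 and alpha = 0.05 in the usage lines are assumptions made for illustration, since the slide does not state them.

```python
import random
from math import sqrt

def t_stat(sample, mu0):
    """One-sample t statistic (xbar - mu0) / (s / sqrt(n))."""
    n = len(sample)
    xbar = sum(sample) / n
    s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)
    return (xbar - mu0) / sqrt(s2 / n)

def simulated_power(n, difference, sigma, alpha, reps=4000, seed=1):
    """Monte Carlo estimate of the power of the upper-tailed t-test.

    The mean under H0 is taken as 0 without loss of generality.
    """
    rng = random.Random(seed)
    # Estimate the critical value t_{alpha, n-1} from samples drawn under H0.
    null_stats = sorted(
        t_stat([rng.gauss(0.0, sigma) for _ in range(n)], 0.0)
        for _ in range(reps)
    )
    critical = null_stats[int((1.0 - alpha) * reps)]
    # Power: fraction of samples drawn under the shifted mean that reject H0.
    rejections = sum(
        t_stat([rng.gauss(difference, sigma) for _ in range(n)], 0.0) > critical
        for _ in range(reps)
    )
    return rejections / reps

# Difference of .02 from Example 4-8; sigma and alpha are ASSUMED values.
power_15 = simulated_power(n=15, difference=0.02, sigma=0.025, alpha=0.05)
power_30 = simulated_power(n=30, difference=0.02, sigma=0.025, alpha=0.05)
print(f"n = 15: power ~ {power_15:.2f};  n = 30: power ~ {power_30:.2f}")
```

Because the estimate is based on a finite number of simulated samples, it will differ slightly from the value Minitab reports; increasing `reps` reduces that difference.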

Sample size

• The sample size required to detect a given difference from the hypothesized mean with a specified power, when the variance of the characteristic of interest is unknown, will be determined using Minitab.

Example 4-8

• Stat > Power and Sample Size > 1-Sample t

Calculate sample size for each power value

Specify a power, difference, and sigma; click on Options and enter the type of alternative hypothesis and the level of significance.

4-5 Inference on the Mean of a Population, Variance Unknown

4-5.3 Type II Error and Choice of Sample Size

Fortunately, this unpleasant task has already been done, and the results are summarized in a series of graphs in Appendix A (Charts Va, Vb, Vc, and Vd) that plot β for the t-test against a parameter d for various sample sizes n.

These graphs are called operating characteristic (or OC) curves. Curves are provided for two-sided alternatives on Charts Va and Vb. The abscissa scale factor d on these charts is defined as

$$d = \frac{|\mu - \mu_0|}{\sigma} = \frac{|\delta|}{\sigma}$$

Confidence Intervals for the Mean

We can derive a confidence interval for the mean from the t-distributed statistic in a manner similar to when the test statistic follows the standard normal distribution. We can write:

$$P\left(-t_{\alpha/2,\,n-1} \le \frac{\bar{X} - \mu}{S/\sqrt{n}} \le t_{\alpha/2,\,n-1}\right) = 1 - \alpha$$

$$P\left(-t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}} \le \bar{X} - \mu \le t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}}\right) = 1 - \alpha$$

$$P\left(\bar{X} - t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}}\right) = 1 - \alpha, \qquad (29)$$

where

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}.$$

By the definition of confidence intervals (see (9)), we see from (29) that

A $100(1-\alpha)\%$ confidence interval for the mean $\mu$ of a normal distribution, variance unknown, is

$$\bar{x} \pm t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} = \left(\bar{x} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}},\ \ \bar{x} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}\right). \qquad (30)$$

To obtain one-sided lower (upper) $100(1-\alpha)\%$ limits, in (30), change $\alpha/2$ to $\alpha$ and take the lower (upper) limit.

Example 9: Suppose in Example 7 neither the mean nor the standard deviation is known.

A. Use the data given in Example 7 to find a 99% confidence interval for the mean.

In Example 7, $\bar{x} = 9.9908$ and $s = 0.03540$. The 99% confidence interval when the variance is unknown is thus

$$\bar{x} \pm t_{0.005,\,39}\frac{s}{\sqrt{n}} = 9.9908 \pm (2.7079)\frac{0.03540}{\sqrt{40}} = 9.9908 \pm (2.7079)(0.0055972) = 9.9908 \pm 0.015156 = (9.976,\ 10.006)$$

Recall that when the variance was known in Example 7, the CI was

$$9.9908 \pm 0.0163 = (9.974,\ 10.007)$$

B. Now use MINITAB.

Go to: Stat > Basic Statistics > 1-Sample t

The results appear on the next slide.
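The hand calculation in part A of Example 9 can be reproduced in Python as a check. The sketch below uses only the standard library, which has no t quantile function, so the percentage point $t_{0.005,39} = 2.7079$ is taken from the slide as a constant rather than computed; the variable names are illustrative.

```python
from math import sqrt
from statistics import fmean, stdev

# The 40 tensile strength measurements (gms) from Example 7.
data = [
    9.99, 10.05, 10.02, 9.99, 10.05, 9.96, 10.00, 9.99, 10.03, 10.05,
    10.02, 9.98, 9.96, 10.01, 10.00, 9.90, 10.04, 9.99, 9.93, 9.99,
    9.97, 9.96, 10.00, 10.07, 9.95, 9.96, 9.97, 9.97, 10.01, 9.97,
    9.99, 10.01, 10.01, 10.01, 9.95, 9.99, 9.97, 9.95, 9.96, 10.01,
]

T_CRIT = 2.7079  # t_{0.005,39}, as given on the slide (from Table II or MINITAB)

n = len(data)
xbar = fmean(data)   # sample mean
s = stdev(data)      # sample standard deviation (divisor n - 1)
half_width = T_CRIT * s / sqrt(n)
lower, upper = xbar - half_width, xbar + half_width

print(f"xbar = {xbar:.4f}, s = {s:.5f}")
print(f"99% t interval: ({lower:.3f}, {upper:.3f})")
```

The interval should match the hand result (9.976, 10.006). It is slightly narrower than the known-variance interval here only because the sample standard deviation, 0.0354, happens to be smaller than the assumed value of 0.04.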

Sample Size Determination for Confidence Intervals (Variance Unknown)

Recall that in the case where the variance is known, the sample size for a $100(1-\alpha)\%$ confidence interval is given by (26), repeated below:

$$n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2. \qquad (26)$$

In the present case with the variance unknown, the best that can be done is to use (26) with your best guess of $\sigma$. You can also take a pilot sample and estimate $\sigma$.
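As an illustration of how (26) is used with a guessed standard deviation, the sketch below computes the sample size and rounds up to the next whole observation. It uses only the Python standard library; `sample_size` is a hypothetical helper name, and the target error E = 0.01 in the usage line is an assumed value chosen for illustration, combined with the standard deviation 0.04 from Example 7 as the guess for sigma.

```python
from math import ceil
from statistics import NormalDist

def sample_size(sigma_guess, error, conf):
    """Sample size from (26): n = (z_{alpha/2} * sigma / E)^2, rounded up."""
    alpha = 1.0 - conf
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # upper alpha/2 percentage point
    return ceil((z * sigma_guess / error) ** 2)

# ASSUMED inputs: sigma guess 0.04 (Example 7), error E = 0.01, 99% confidence.
n_required = sample_size(sigma_guess=0.04, error=0.01, conf=0.99)
print(n_required)
```

Because n grows with the square of the guessed standard deviation, a guess that is too small understates the required sample size; when in doubt, a conservative (larger) guess or a pilot sample is the safer choice.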

4-5 Inference on the Mean of a Population, Variance Unknown

4-5.4 Confidence Interval on the Mean
