6.3 One- and Two-Sample Inferences for Means

If σ is known, a 95% confidence interval is

$$\bar{x} \pm 1.96\,SE_{\bar{x}} = \bar{x} \pm 1.96\,\frac{\sigma}{\sqrt{n}}$$

But σ is never “known”.

If σ is unknown:

• Estimate σ by the sample standard deviation s.

• The estimated standard error of the mean will be $\widehat{SE} = s/\sqrt{n}$.

• Using the estimated standard error we have a confidence interval of $\bar{x} \pm (\_\_\_\_)\,(s/\sqrt{n})$, where the multiplier is still to be determined.

• The multiplier needs to be bigger than z (e.g., 1.96 for a 95% CI). The confidence interval needs to be wider to take into account the added uncertainty in using s to estimate σ.

• The correct multipliers were figured out by a Guinness Brewery worker (W. S. Gosset, who published as “Student”).

What is the correct multiplier? “t”

• 100(1−α)% confidence interval when σ is unknown:

$$\bar{x} \pm t_{1-\alpha/2}\,(s/\sqrt{n})$$

• 95% CI = 100(1−0.05)% confidence interval when σ is unknown:

$$\bar{x} \pm t_{0.975}\,(s/\sqrt{n})$$

Properties of t distribution

• The value of t depends on how much information we have about σ. The amount of information we have about σ depends on the sample size.

• The information is measured in “degrees of freedom”, and for a sample from one normal population this will be df = n − 1.

t curve and z curve

Both the standard normal curve N(0,1) (the z distribution) and all t(ν) distributions are density curves, symmetric about a mean of 0, but t distributions have more probability in the tails.

As the sample size (and hence the degrees of freedom) increases, the extra tail probability shrinks and the t distribution more closely approximates the z distribution. By n = 1000 they are virtually indistinguishable from one another.
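The heavier tails can be checked numerically. The sketch below is an illustration, not part of the original notes: it integrates the t density with Simpson's rule using only the Python standard library and compares P(T > 1.96) with the corresponding normal tail. The helper names `t_pdf` and `t_upper_tail` are made up for this example.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(x, df, steps=2000):
    """P(T > x) for x >= 0: 0.5 minus the Simpson's-rule area over [0, x]."""
    h = x / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

z_tail = 1 - NormalDist().cdf(1.96)
tails = {df: t_upper_tail(1.96, df) for df in (5, 30, 1000)}

print(f"z         : P(Z > 1.96) = {z_tail:.4f}")
for df, p in tails.items():
    print(f"t(df={df:<4}): P(T > 1.96) = {p:.4f}")
```

The tail probability beyond 1.96 should shrink toward the normal value as df grows, which is why 1.96 is too small a multiplier for small samples.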

Quantiles of t distribution

The t table is given in the book: Table B.4

The quantile depends on the degrees of freedom as well as on the probability:

df    probability    t
5     0.90           1.476
10    0.95           1.812
20    0.99           2.528
25    0.975          2.060
∞     0.975          1.96
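When a table is not at hand, the same quantiles can be approximated numerically. The following is a minimal standard-library sketch, not the book's method: `t_cdf` integrates the t density with Simpson's rule and `t_quantile` inverts it by bisection. Both function names are hypothetical.

```python
import math

def t_cdf(x, df, steps=2000):
    """P(T <= x), via Simpson's rule on the t density and symmetry about 0."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    a = abs(x)
    h = a / steps
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3               # area between 0 and |x|
    return 0.5 + area if x >= 0 else 0.5 - area

def t_quantile(p, df, lo=-50.0, hi=50.0):
    """Value t_p with P(T <= t_p) = p, found by bisection on the CDF."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Same (df, probability) pairs as the finite-df rows of the table
for df, p in [(5, 0.90), (10, 0.95), (20, 0.99), (25, 0.975)]:
    print(df, p, round(t_quantile(p, df), 3))
```

In practice a statistics package would be used for this; the point here is only that the tabled values come from inverting the t cumulative distribution for a given df.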

Confidence interval for the mean when σ is unknown

$$\bar{x} - t\,\frac{s}{\sqrt{n}} < \mu < \bar{x} + t\,\frac{s}{\sqrt{n}}$$

Example

• Noise level, n = 12:

74.0 78.6 76.8 75.5 73.8 75.6 77.3 75.8 73.9 70.2 81.0 73.9

1. Point estimate for the average noise level of vacuum cleaners;
2. 95% confidence interval

Solution

• n = 12, $\bar{x} = 75.53$, $s = 2.75$

• Critical value with df = 11: $t_{0.975} = 2.201$

• 95% CI:

$$75.53 \pm 2.201\,\frac{2.75}{\sqrt{12}} = 75.53 \pm 1.75, \quad\text{i.e. } (73.78,\ 77.28)$$
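The arithmetic in this solution can be reproduced with a few lines of Python. This is a sketch using only the standard library; the critical value 2.201 is taken from the notes rather than computed, and the variable names are illustrative.

```python
import math
import statistics

noise = [74.0, 78.6, 76.8, 75.5, 73.8, 75.6,
         77.3, 75.8, 73.9, 70.2, 81.0, 73.9]
T_975_DF11 = 2.201                 # table value quoted above (df = 11)

n = len(noise)
xbar = statistics.mean(noise)      # point estimate of the mean noise level
s = statistics.stdev(noise)        # sample standard deviation (divisor n - 1)
se = s / math.sqrt(n)              # estimated standard error of the mean
margin = T_975_DF11 * se
ci = (xbar - margin, xbar + margin)

print(f"n = {n}, xbar = {xbar:.2f}, s = {s:.2f}, SE = {se:.3f}")
print(f"95% CI: {ci[0]:.2f} to {ci[1]:.2f}")
```

Small differences in the last digit relative to the hand calculation are expected, because the hand calculation rounds the mean and the margin before combining them.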

Example 8 (page 366) Failure times of 10 springs. The normal plot looks fairly straight. (If not, try transforming or a different distribution, e.g. Weibull)

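$\bar{x} = 168.3$, $s = 33.1$

$$SE = \frac{s}{\sqrt{n}} = \frac{33.1}{\sqrt{10}} = 10.47$$

90% CI: 168.3 ± 1.833(10.47) = 168.3 ± 19.2, i.e. 149.1 to 187.5

If we were to test H0: μ = 150 vs Ha: μ ≠ 150 at the α = 0.10 level, we would not reject H0, since 150 is in the 90% confidence interval for μ.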

To test H0: μ = #, proceed analogously to the large sample test with the z test statistic.

We would have

$$T = \frac{\bar{x} - \#}{s/\sqrt{n}}$$

Whether to reject H0, and the p-value, are found from the t table with df = n − 1.

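We could do the test using:

$$SE = \frac{s}{\sqrt{n}} = \frac{33.1}{\sqrt{10}} = 10.47$$

$$t = \frac{168.3 - 150}{10.47} = \frac{18.3}{10.47} = 1.74$$

With Q(p) denoting the p quantile of the t distribution with df = 9:

$$Q(0.9) = 1.383 < 1.74 < 1.833 = Q(0.95), \quad\text{so}\quad 0.05 < P(t > 1.74) < 0.1$$

$$p\text{-value} = P(|t| > 1.74) = 2\,P(t > 1.74), \quad\text{so}\quad 0.1 < p\text{-value} < 0.2$$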
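The table lookup used to bracket the p-value can be written out explicitly. This is only a sketch: the function names are hypothetical, and the df = 9 row includes standard table entries (2.262, 2.821, 3.250) beyond the two values quoted in the notes.

```python
import math

# df = 9 row of a t table: cumulative probability -> quantile
T_ROW_DF9 = {0.90: 1.383, 0.95: 1.833, 0.975: 2.262, 0.99: 2.821, 0.995: 3.250}

def one_sample_t(xbar, s, n, mu0):
    """t statistic for H0: mu = mu0."""
    return (xbar - mu0) / (s / math.sqrt(n))

def bracket_upper_tail(t_stat, row):
    """Bounds (low, high) on P(T > t_stat) from tabled quantiles, for t_stat > 0."""
    low, high = 0.0, 0.5
    for prob, q in sorted(row.items()):
        if t_stat > q:
            high = 1 - prob        # t_stat lies beyond this quantile
        else:
            low = 1 - prob         # first quantile that t_stat does not exceed
            break
    return low, high

t_stat = one_sample_t(168.3, 33.1, 10, 150)
low, high = bracket_upper_tail(t_stat, T_ROW_DF9)

print(f"t = {t_stat:.2f}")
print(f"{low:.3f} < P(T > t) < {high:.3f}")
print(f"{2 * low:.3f} < two-sided p-value < {2 * high:.3f}")
```

A table can only bracket the p-value between two tabled probabilities; software would report a single number inside that bracket.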

But the confidence interval is more informative.

On the other hand

• $t_{0.95,9} = 1.833$, $t_{0.05,9} = -1.833$

• If we have a test statistic value that is either too small (< −1.833) or too big (> 1.833), then we have strong evidence against H0 (two-sided test at the α = 0.10 level).

• Here t = 1.74, which is neither too small nor too big compared with the cutoff values above (the “critical values”), so we cannot reject the null.

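Another method: Rejection Regions

Alternative hypothesis    Rejection region (z test)         Rejection region (t test)
μ > μ0                    z > z_α                           t > t_α
μ < μ0                    z < −z_α                          t < −t_α
μ ≠ μ0                    z > z_{α/2} or z < −z_{α/2}       t > t_{α/2} or t < −t_{α/2}

Here z_α and t_α denote the values with upper-tail probability α (for t, with the appropriate df).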

Rejection Region method and p-value method

• For Ha: μ < μ0, if the z test statistic is less than −1.645, then the p-value is less than 0.05. Comparing the p-value to 0.05 is the same as comparing the z value to −1.645.

• For t tests we can also find critical values corresponding to a level α that we can compare to our test statistic.

• The test statistic being in the rejection region is the same as the p-value being less than α.
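The equivalence of the two methods can be illustrated for the lower-tailed z test. This is a sketch under stated assumptions: the z values tried are made up, and `lower_tail_test` is a hypothetical helper, not something from the book.

```python
from statistics import NormalDist

ALPHA = 0.05
Z = NormalDist()                   # standard normal distribution
z_crit = Z.inv_cdf(ALPHA)          # lower-tail cutoff, about -1.645

def lower_tail_test(z_stat):
    """For Ha: mu < mu0, return (in rejection region?, p-value < alpha?)."""
    p_value = Z.cdf(z_stat)        # P(Z <= z_stat)
    return z_stat < z_crit, p_value < ALPHA

# Hypothetical test statistics on both sides of the cutoff
for z_stat in (-2.50, -1.70, -1.60, 0.30):
    print(z_stat, lower_tail_test(z_stat))
```

Both entries of each pair agree, because the normal CDF is increasing: z falls below the cutoff exactly when its lower-tail probability falls below α.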

Paired Data

T = top water zinc concentration (mg/L)
B = bottom water zinc concentration (mg/L)

Location    1       2       3       4       5       6
Top         0.415   0.238   0.390   0.410   0.605   0.609
Bottom      0.430   0.266   0.567   0.531   0.707   0.716

1982 study of trace metals in South Indian River. 6 random locations

6.3.2 Paired Mean Difference

To compare top and bottom water zinc from a river, we can equivalently ask: is it true that the mean difference (bottom − top) is greater than 0?

This is a special case of the mean of a single column of numbers. Create a new column for the difference between the 2 variables.

Top & Bottom Water Zinc from a River

Location    Bottom    Top      Difference = d
1           0.430     0.415    0.015
2           0.266     0.238    0.028
3           0.567     0.390    0.177
4           0.531     0.410    0.121
5           0.707     0.605    0.102
6           0.716     0.609    0.107

$\bar{d} = 0.092$, $s_d = 0.061$

Check normality

Ordered $d_i$    0.015    0.028    0.102    0.107    0.121    0.177
Z quantiles      -1.38    -0.67    -0.21    0.21     0.67     1.38

[Figure: “Zinc in River”, normal plot of the ordered differences (horizontal axis, 0 to 0.2) against the Z quantiles.]

Note: Even normal plots from random normal data are not perfectly straight

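n = 6 $d_i$ values, df = 6 − 1 = 5

95% Confidence Interval, t = 2.571:

$$\bar{d} = 0.092, \quad s_d = 0.061, \quad SE_{\bar{d}} = \frac{s_d}{\sqrt{n}} = \frac{0.061}{\sqrt{6}} = 0.025$$

$$0.092 \pm 2.571\,(0.025) = 0.092 \pm 0.064, \quad\text{i.e. } 0.028 \text{ to } 0.156$$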

By the usual hypothesis testing perspective, the p-value for H0: μ_d = 0 vs Ha: μ_d ≠ 0 is less than 0.05, since μ_d = 0 is not in the 95% confidence interval. Our results would be “statistically significant” evidence against H0: μ_d = 0.

$$\bar{d} = 0.092, \quad s_d = 0.061, \quad SE = \frac{s_d}{\sqrt{n}} = 0.025$$

$$t = \frac{0.092 - 0}{0.025} = 3.68, \quad |t| > 2.571$$

$$2(0.005) < p < 2(0.01), \quad\text{i.e. } 0.01 < p < 0.02$$

In hypothesis testing notation the p-value must be less than α to claim statistical significance. A test is significant at the α level if the H0 value is not in the 100(1−α)% confidence interval.

What about the α = 0.01 level? The p-value is greater than 0.01, so don't reject H0.

99% CI: 0.092 ± 4.032(0.025) = 0.092 ± 0.100, i.e. −0.008 to 0.192. The interval contains 0, so the result is not statistically significant at the 0.01 level.

Assumptions: The population of differences follows a normal distribution. A normal plot of differences, d’s, should be fairly straight. Note: We don’t need B or T to be normal.
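The whole paired analysis reduces to a one-sample analysis of the differences, which a short script makes concrete. This is a sketch with illustrative variable names; the t multipliers 2.571 and 4.032 are the df = 5 table values used in the notes.

```python
import math
import statistics

bottom = [0.430, 0.266, 0.567, 0.531, 0.707, 0.716]
top = [0.415, 0.238, 0.390, 0.410, 0.605, 0.609]
T_975_DF5, T_995_DF5 = 2.571, 4.032    # table values for df = 5

d = [b - t for b, t in zip(bottom, top)]   # one difference per location
n = len(d)
d_bar = statistics.mean(d)
s_d = statistics.stdev(d)
se = s_d / math.sqrt(n)
t_stat = (d_bar - 0) / se                  # test of H0: mu_d = 0

ci95 = (d_bar - T_975_DF5 * se, d_bar + T_975_DF5 * se)
ci99 = (d_bar - T_995_DF5 * se, d_bar + T_995_DF5 * se)

print(f"d_bar = {d_bar:.3f}, s_d = {s_d:.3f}, SE = {se:.3f}, t = {t_stat:.2f}")
print(f"95% CI: {ci95[0]:.3f} to {ci95[1]:.3f}  (contains 0: {ci95[0] <= 0 <= ci95[1]})")
print(f"99% CI: {ci99[0]:.3f} to {ci99[1]:.3f}  (contains 0: {ci99[0] <= 0 <= ci99[1]})")
```

Run at full precision, the t statistic comes out near 3.70 rather than 3.68, because the hand calculation divides the rounded values 0.092 and 0.025. The conclusions at both levels are unchanged.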

Rejection region exercise

Tell when to reject H0: μ = 120 using a t-test. Answers would be of the form

reject H0 when t < -1.746, or maybe
reject H0 when |t| > 1.746, or maybe
reject H0 when t > 1.746

(a) HA: μ < 120, α = 0.05, n=20

(b) HA: μ > 120, α = 0.10, n=18

(c) HA: μ ≠ 120, α = 0.01, n=9

Find p-value exercise

Answers would be of the form

0.05 < p < 0.10
p < 0.001
p > 0.8

After finding the p-value in each case, tell whether to reject or not reject H0 at the α = 0.05 level.

(a) HA: μ > 120, n=7, t = -2.58

(b) HA: μ < 120, n=7, t = -2.58

(c) HA: μ ≠ 120, n=7, t = -2.58

6.3.3 Large Sample Comparisons of Two Means

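Glue 1: $\mu_1, \sigma_1^2$

Glue 2: $\mu_2, \sigma_2^2$. Both populations normal.

$n_1$ independent values for glue 1, $n_2$ independent values for glue 2

Not paired, blocked, …

$\bar{x}_1 - \bar{x}_2$ is our guess at $\mu_1 - \mu_2$.

How much might $\bar{x}_1 - \bar{x}_2$ deviate from $\mu_1 - \mu_2$?

$$Var(\bar{x}_1 - \bar{x}_2) = Var(\bar{x}_1) + (-1)^2\,Var(\bar{x}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$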

Experiment    $\bar{x}_1$    $\bar{x}_2$    $\bar{x}_1 - \bar{x}_2$
1             10.1           11.2           -1.1
2             11.4           10.6           0.8
3             12.2           10.4           1.8
…             …              …              …
∞
Mean          $\mu_1$        $\mu_2$        $\mu_1 - \mu_2$
Variance      $SE_1^2$       $SE_2^2$       $SE_1^2 + SE_2^2$
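The idea of repeating the experiment many times can be mimicked by simulation. The sketch below is illustrative only: the population means, standard deviations, and sample sizes are hypothetical values chosen for the demonstration, not data from the notes.

```python
import random
import statistics

random.seed(1)
MU1, SIGMA1, N1 = 10.0, 2.0, 8     # hypothetical glue 1 population and sample size
MU2, SIGMA2, N2 = 11.0, 3.0, 12    # hypothetical glue 2 population and sample size
REPS = 20000                       # number of simulated experiments

diffs = []
for _ in range(REPS):
    x1 = statistics.fmean(random.gauss(MU1, SIGMA1) for _ in range(N1))
    x2 = statistics.fmean(random.gauss(MU2, SIGMA2) for _ in range(N2))
    diffs.append(x1 - x2)          # one row of the table: xbar1 - xbar2

theory_var = SIGMA1**2 / N1 + SIGMA2**2 / N2   # SE1^2 + SE2^2
sim_mean = statistics.fmean(diffs)
sim_var = statistics.variance(diffs)

print(f"mean of differences:     {sim_mean:.3f}  (theory {MU1 - MU2:.3f})")
print(f"variance of differences: {sim_var:.3f}  (theory {theory_var:.3f})")
```

Over many simulated experiments, the mean of the differences should settle near μ1 − μ2 and their variance near the sum of the two squared standard errors, matching the last two rows of the table.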

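A confidence interval for $\mu_1 - \mu_2$ is given by

$$\bar{x}_1 - \bar{x}_2 \pm z\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \bar{x}_1 - \bar{x}_2 \pm z\sqrt{SE_1^2 + SE_2^2}$$

And for hypothesis testing, with $\mu_1 - \mu_2$ set to its value under H0,

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

But we never really know $\sigma_1^2$ and $\sigma_2^2$.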

6.3.4 Small Sample Comparisons of Two Means

The confidence intervals need to be widened to account for additional uncertainty in $s_1^2$ and $s_2^2$ as estimators of $\sigma_1^2$ and $\sigma_2^2$.

Case 1: Assume equal variances, $\sigma_1^2 = \sigma_2^2 = \sigma^2$.

Case 2: Don’t assume the variances are necessarily equal. But they may be.

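Case 1: Both $s_1^2$ and $s_2^2$ are estimators of $\sigma^2$.

Pool $s_1^2$ and $s_2^2$ into a pooled, combined estimate of $\sigma^2$:

$s_p^2$ = weighted average of $s_1^2$ and $s_2^2$, weighted by df.

$$s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{(n_1 - 1) + (n_2 - 1)}$$

The confidence interval for $\mu_1 - \mu_2$ is

$$\bar{x}_1 - \bar{x}_2 \pm t\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}} = \bar{x}_1 - \bar{x}_2 \pm t\,s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

For the tabled value of t, use df = (n1-1) + (n2-1)

To test H0: $\mu_1 - \mu_2 = \#$, check if # is in the confidence interval or use

$$T = \frac{\bar{x}_1 - \bar{x}_2 - \#}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

and compare to the t table with df = (n1-1) + (n2-1).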

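Case 2: Don’t assume $\sigma_1^2 = \sigma_2^2 = \sigma^2$.

$$\bar{x}_1 - \bar{x}_2 \pm t\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \quad\text{or}\quad \bar{x}_1 - \bar{x}_2 \pm t\sqrt{SE_1^2 + SE_2^2}$$

with approximate degrees of freedom

$$df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{s_1^4}{(n_1 - 1)\,n_1^2} + \dfrac{s_2^4}{(n_2 - 1)\,n_2^2}}$$

Note: With $n_1 = n_2 = n$, only df changes, since

$$\sqrt{SE_1^2 + SE_2^2} = \sqrt{\frac{s_1^2}{n} + \frac{s_2^2}{n}} = s_p\sqrt{\frac{1}{n} + \frac{1}{n}}$$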

Lifetime of Springs. Table 6.7 and Figure 6.15

[Figure 6.15, “Springs”: the 900 Stress and 950 Stress samples plotted against Lifetime (horizontal axis, 100 to 300).]

900 Stress:    216  162  153  216  225  216  306  225  243  189
950 Stress:    225  171  198  189  189  135  162  135  117  162

               900 Stress    950 Stress
n              10            10
x̄              215.1         168.3
s              42.9          33.1
SE = s/√n      13.57         10.47

Note: Usually lifetimes are more lognormal than normal. To follow the book’s example, carry on in time scale.

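Case 1: Assume $\sigma_1^2 = \sigma_2^2 = \sigma^2$

$$s_p^2 = \frac{9\,(42.9)^2 + 9\,(33.1)^2}{9 + 9} = 1468$$

$$SE = \sqrt{1468\left(\frac{1}{10} + \frac{1}{10}\right)} = 17.1$$

95% CI: 215.1 − 168.3 ± 2.101(17.1) = 46.8 ± 36.0, i.e. 10.8 to 82.8

Based on the confidence interval, we reject H0: μ1 − μ2 = 0 vs Ha: μ1 − μ2 ≠ 0 at the α = 0.05 level.

Alternatively, to test H0: μ1 − μ2 = 0 vs Ha: μ1 − μ2 ≠ 0:

$$t = \frac{215.1 - 168.3}{17.1} = 2.7, \quad df = 9 + 9 = 18$$

$$2(0.005) < p < 2(0.01), \quad\text{i.e. } 0.01 < p < 0.02$$

To test H0: μ1 − μ2 = 0 vs Ha: μ1 − μ2 > 0, 0.005 < p < 0.01. To test H0: μ1 − μ2 = 0 vs Ha: μ1 − μ2 < 0, 0.99 < p < 0.995.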
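The pooled (Case 1) calculation can be reproduced from the raw lifetimes. This is a sketch with illustrative variable names: the two lists are the Table 6.7 values grouped so that they reproduce the stated means 215.1 and 168.3, and 2.101 is the df = 18 table value used in the notes.

```python
import math
import statistics

stress_900 = [216, 162, 153, 216, 225, 216, 306, 225, 243, 189]
stress_950 = [225, 171, 198, 189, 189, 135, 162, 135, 117, 162]
T_975_DF18 = 2.101                 # table value for df = 18

n1, n2 = len(stress_900), len(stress_950)
x1, x2 = statistics.mean(stress_900), statistics.mean(stress_950)
v1, v2 = statistics.variance(stress_900), statistics.variance(stress_950)

df = (n1 - 1) + (n2 - 1)
sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / df     # pooled variance, weighted by df
se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
diff = x1 - x2
ci = (diff - T_975_DF18 * se, diff + T_975_DF18 * se)
t_stat = diff / se                             # test of H0: mu1 - mu2 = 0

print(f"x1 = {x1:.1f}, x2 = {x2:.1f}, sp^2 = {sp2:.0f}, SE = {se:.1f}, df = {df}")
print(f"95% CI: {ci[0]:.1f} to {ci[1]:.1f}")
print(f"t = {t_stat:.2f}")
```

Working from unrounded standard deviations gives a pooled variance slightly above the hand-computed 1468, so the last digits may differ a little from the slide values without changing the conclusion.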

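Case 2: Not assuming $\sigma_1^2 = \sigma_2^2$

$$SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{(42.9)^2}{10} + \frac{(33.1)^2}{10}} = 17.1$$

The approximate degrees of freedom formula gives df = 16.9 ≈ 17.

Note: With $n_1 = n_2 = n$, only df changes, since $\sqrt{SE_1^2 + SE_2^2} = s_p\sqrt{1/n + 1/n}$.

46.8 ± 2.110(17.1): a (slightly) wider CI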
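The unequal-variance (Case 2) numbers follow from the summary statistics alone. In this sketch `welch_se_df` is a hypothetical helper name, and 2.110 is the df = 17 table value used above.

```python
import math

def welch_se_df(s1, n1, s2, n2):
    """Standard error of xbar1 - xbar2 and approximate df, without pooling."""
    v1, v2 = s1**2 / n1, s2**2 / n2            # SE1^2 and SE2^2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return se, df

T_975_DF17 = 2.110                 # table value for df = 17
se, df = welch_se_df(42.9, 10, 33.1, 10)
diff = 215.1 - 168.3
margin = T_975_DF17 * se

print(f"SE = {se:.1f}, approximate df = {df:.1f}")
print(f"95% CI: {diff:.1f} ± {margin:.1f}")
```

With equal sample sizes the standard error matches the pooled one, so the interval widens only through the smaller df and the resulting larger t multiplier.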
