Lecture Hypothesis QM v2 - University of Haifa · Find the critical value and rejection region for...

---Unclassified---

Quantitative Methods 2013

Hypothesis Testing with One Sample


Concept of Hypothesis Testing

Testing Hypotheses is another way to deal with the problem of making a statement about an unknown population parameter, based on a random sample.

Instead of finding an estimate for the parameter, we can often find it convenient to hypothesize a value for it and then use the information from the sample to confirm or refute the hypothesized value.

A hypothesis is a judgment about the population parameter based simply on an assumption or intuition with no concrete backup information or analysis.

For example, a hypothesis about a population mean:

The mean monthly cell phone bill in Haifa is at most $42 (μ ≤ $42).

Hypothesis testing

Hypothesis testing is the comparison of the analyst's belief or claim about a population parameter to the corresponding sample statistic and deciding whether or not the belief or claim about the population parameter is correct.

Example: Filling of soft drink bottles in a bottling plant.

Managers have the belief that the machine is filling 1L bottles with 1L of the soft drink.

This belief is tested by drawing a sample of filled bottles, measuring the amount of soft drink in each, finding the mean and variance of the amount of soft drink in the sample, comparing the sample mean with the 1L belief, and finally deciding whether the information from the sample supports or refutes the 1L belief.

Stating a Hypothesis

The Null Hypothesis, H0

This is the hypothesis or claim that is initially assumed to be true.

It is what is believed to be, presumed to be, or accepted as true and correct without any evidentiary support.

In the bottling scenario, managers believe the machine is filling the bottles correctly even though they have no evidence to support that belief.

The null hypothesis in that case is that the bottling machine is functioning correctly.

The Null Hypothesis, H0

Example: The average number of TV sets in U.S. homes is equal to three (H0: μ = 3)

Always contains the “=”, “≤” or “≥” sign.

Is always about a population parameter, not about a sample statistic: H0: μ = 3 is a valid null hypothesis, H0: X̄ = 3 is not.

The Alternative Hypothesis, H1 , (Ha)

This is the hypothesis or claim which we initially assume to be false but which we may decide to accept if there is sufficient evidence.

In the bottling scenario, managers will not believe the machine is filling the bottles incorrectly unless they have evidence to support that belief.

It is the sample which provides this evidence. The alternative hypothesis in that case is that the bottling machine is not functioning correctly.

The Alternative Hypothesis, H1 , (Ha)

Is the opposite (the complement) of the null hypothesis

e.g., The average number of TV sets in U.S. homes is not equal to 3 ( H1: μ ≠ 3 )

Never contains the “=” , “≤” or “≥” sign

This procedure is familiar to us already from the legal system:

“Innocent until proven guilty”.

The Null Hypothesis (Innocent) is only rejected in favor of the Alternative Hypothesis (Guilty) if there is sufficient evidence of this.

Example:

A weight loss company claims its product, when used as directed, will lead to at least 10 kilograms of weight loss within 45 days.

A consumer protection agency wants to verify the company's claim in order to protect the public from fraud.

What is the hypothesis statement the agency director should use?

Since the consumer protection agency wants evidence that the weight loss product does not perform as claimed, the alternative hypothesis is that the product yields less than 10 kg of weight loss. The null hypothesis is its complement.

The entire hypothesis statement is:

H0: µ ≥ 10 kg

Ha: µ < 10 kg

Rejection Regions

One way of performing a Hypothesis test is to compute a rejection region.

If we find that our sample statistic is in this region then we reject our Null Hypothesis.

What is important is the idea that the rejection region is a region far away from the value stated in our Null Hypothesis, and that it is unlikely that we would observe a sample with a value of the sample statistic (for example the sample mean) this far away from the Null Hypothesised value if that Null Hypothesis was true.

For example, consider a class of 100 people.

Suppose our Null Hypothesis is that the average age of the whole class is 31.

H0: the population mean age is 31.

H0: µ = 31

Ha: µ ≠ 31

However, suppose we now observe a value of 20 in a sample we have randomly chosen. That’s very unlikely to have happened by chance if our Null Hypothesis was true. It’s much more likely that our Null Hypothesis is false.

If instead we took a sample from our population and its average age was 33 years, we might say that the average age of the population is not significantly different from 31. In this case we would not reject the Null Hypothesis.

Random sample: suppose the sample mean age is 20 (X̄ = 20).

Is X̄ = 20 likely if μ = 31? If not likely, REJECT the Null Hypothesis.

So we decide to reject our Null Hypothesis in favour of an alternative, which is that the true average age of the whole class is different from 31 years.

[Figure: Sampling distribution of X̄, centered at μ = 31 if H0 is true, with a rejection region in each tail.]

We reject the null hypothesis that μ = 31 if the sample mean falls in either of the two tail regions.

If the sample mean falls in the central region we do not reject the null hypothesis.

Rejection Regions and Critical Values

A rejection region (or critical region) of the sampling distribution is the range of values for which the null hypothesis is not probable.

A critical value separates the rejection region from the nonrejection region.

Level of Significance, α

The level of significance, α, defines the unlikely values of the sample statistic if the null hypothesis is true.

– Defines the rejection region of the sampling distribution

– Typical values are 0.01, 0.05, or 0.10

– Is selected by the researcher at the beginning

– Provides the critical value(s) of the test

By setting the level of significance at a small value, you are saying that you want the probability of rejecting a true null hypothesis to be small.

There are three types of hypothesis tests: a left-, right-, or two-tailed test.

The type of test depends on the region of the sampling distribution that favors a rejection of H0. This region is indicated by the alternative hypothesis.

Left‐tailed Test

1. If the alternative hypothesis contains the less‐than inequality symbol (<), the hypothesis test is a left‐tailed test. The area of the left tail equals the level of significance, α.

H0: μ ≥ k

H1: μ < k

Right‐tailed Test

2. If the alternative hypothesis contains the greater‐than symbol (>), the hypothesis test is a right‐tailed test. The area of the right tail equals α.

H0: μ ≤ k

Ha: μ > k

Two‐tailed Test

3. If the alternative hypothesis contains the not‐equal‐to symbol (≠), the hypothesis test is a two‐tailed test. In a two‐tailed test, each tail has an area of α/2.

H0: μ = k

H1: μ ≠ k
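The three cases can be sketched in Python. This is only a minimal illustration, assuming a z test (standard Normal sampling distribution). The helper name `z_critical_values` and the tail labels are hypothetical, and `statistics.NormalDist().inv_cdf` from the standard library stands in for a Normal table lookup.

```python
from statistics import NormalDist


def z_critical_values(alpha, tail):
    """Return the critical value(s) of a z test as a tuple.

    tail is "left", "right" or "two" (hypothetical labels for this sketch).
    """
    std_normal = NormalDist()  # standard Normal: mean 0, standard deviation 1
    if tail == "left":
        # All of alpha sits in the left tail.
        return (std_normal.inv_cdf(alpha),)
    if tail == "right":
        # All of alpha sits in the right tail.
        return (std_normal.inv_cdf(1 - alpha),)
    if tail == "two":
        # alpha is split evenly: alpha/2 in each tail.
        upper = std_normal.inv_cdf(1 - alpha / 2)
        return (-upper, upper)
    raise ValueError("tail must be 'left', 'right' or 'two'")


print([round(v, 2) for v in z_critical_values(0.05, "left")])   # [-1.64]
print([round(v, 2) for v in z_critical_values(0.05, "right")])  # [1.64]
print([round(v, 2) for v in z_critical_values(0.05, "two")])    # [-1.96, 1.96]
```

The same α gives a more extreme critical value in a two‐tailed test because only α/2 is placed in each tail.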

Statistical Tests

The statistic that is compared with the parameter in the null hypothesis is called the test statistic.

After stating the null and alternative hypotheses and specifying the level of significance, a random sample is taken from the population and sample statistics are calculated.

A test statistic tells us how far, or how many standard deviations, a sample mean is from the population mean. The larger the value of the test statistic, the further the distance, or number of standard deviations, a sample mean is from the population mean stated in the null hypothesis.

If the population standard deviation is known, and the population is (approximately) Normally distributed, then the test statistic is

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$

μ is the hypothesized population mean.

x̄ is the sample mean.

σ is the population standard deviation.

n is the sample size.

σ/√n is the standard error.

z is how many standard errors the observed sample mean is from the hypothesized mean.

If the population standard deviation is unknown then the only standard deviation we can determine is the sample standard deviation, s.

This value of s can be considered an estimate of the population standard deviation.

If the sample size is more than 30 we can use the Normal Distribution (Central Limit Theorem):

$$z = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$

Example: Find the critical value and rejection region for a right‐tailed test with α = 0.5%.

The critical value is z0 = 2.575.

The rejection region is to the right of z0 = 2.575.

Decision Rule Based on Rejection Region

To use a rejection region to conduct a hypothesis test, calculate the test statistic, z.

If the test statistic

1. is in the rejection region, then reject H0.

2. is not in the rejection region, then fail to reject H0.

Left‐Tailed Test: reject H0 if z < z0, where the critical value z0 lies in the left tail (z0 is negative); otherwise fail to reject H0.

Right‐Tailed Test: reject H0 if z > z0; otherwise fail to reject H0.

Two‐Tailed Test: reject H0 if z < −z0 or z > z0; otherwise fail to reject H0.
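The decision rule can be written as a small function. This is a sketch only: the name `decide` and the tail labels are hypothetical, and the numbers in the usage lines are made up for illustration. It assumes z0 is the signed critical value for one‐tailed tests and the positive critical value for a two‐tailed test.

```python
def decide(z, tail, z0):
    """Apply the rejection-region rule to a test statistic z.

    For one-tailed tests z0 is the signed critical value (negative for a
    left-tailed test); for a two-tailed test z0 is the positive critical
    value and the region is z < -z0 or z > z0.
    """
    if tail == "left":
        reject = z < z0
    elif tail == "right":
        reject = z > z0
    elif tail == "two":
        reject = z < -z0 or z > z0
    else:
        raise ValueError("tail must be 'left', 'right' or 'two'")
    return "reject H0" if reject else "fail to reject H0"


# Illustrative (made-up) test statistics:
print(decide(2.9, "right", 2.575))   # reject H0
print(decide(-1.2, "left", -1.645))  # fail to reject H0
print(decide(-2.0, "two", 1.96))     # reject H0
```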

Hypothesis Testing Example

Test the claim that the true mean # of TV sets in US homes is equal to 3.

(Assume σ = 0.8)

1. State the appropriate null and alternative hypotheses

H0: μ = 3 H1: μ ≠ 3 (This is a two‐tail test)

2. Specify the desired level of significance and the sample size

Suppose that α = 0.05 and n = 100 are chosen for this test

3. Determine the appropriate technique

σ is known, so this is a Z test.

4. Determine the critical values

At a significance level of 5% for the test of a difference there is 2.5% in each tail.

Using function NORMSINV in Excel this gives a critical value of z of 1.96.

5. Collect the data and compute the test statistic

Suppose the sample results are

n = 100, X̄ = 2.84 (σ = 0.8 is assumed known)

So the test statistic is:

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{2.84 - 3}{0.8 / \sqrt{100}} = \frac{-0.16}{0.08} = -2.0$$

6. Is the test statistic in the rejection region?

Reject H0 if Z < −1.96 or Z > 1.96; otherwise do not reject H0. Each tail has area α/2 = 0.05/2 = 0.025.

Here, Z = ‐2.0 < ‐1.96, so the test statistic is in the rejection region

Since Z = ‐2.0 < ‐1.96, we reject the null hypothesis and conclude that there is sufficient evidence that the mean number of TVs in US homes is not equal to 3.
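The TV example can be reproduced with a few lines of Python. This is a minimal sketch using the numbers from the example; the variable names are illustrative, and `NormalDist().inv_cdf` is used in place of Excel's NORMSINV.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 3.0    # hypothesized population mean under H0
sigma = 0.8   # population standard deviation (assumed known)
n = 100       # sample size
x_bar = 2.84  # observed sample mean
alpha = 0.05  # level of significance (two-tailed test)

standard_error = sigma / sqrt(n)
z = (x_bar - mu_0) / standard_error

# Two-tailed test: alpha/2 in each tail.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject_h0 = abs(z) > z_crit

print(round(z, 2), round(z_crit, 2), reject_h0)  # -2.0 1.96 True
```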

Connection to Confidence Intervals

For X̄ = 2.84, σ = 0.8 and n = 100, the 95% confidence interval is:

$$2.84 - (1.96)\frac{0.8}{\sqrt{100}} \quad \text{to} \quad 2.84 + (1.96)\frac{0.8}{\sqrt{100}}$$

[2.6832, 2.9968]

Since this interval does not contain the hypothesized mean (3.0), we reject the null hypothesis at α = 0.05
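As a rough check of the interval, the sketch below recomputes it in Python and tests whether the hypothesized mean lies inside. The variable names are illustrative, not part of the lecture.

```python
from math import sqrt
from statistics import NormalDist

x_bar, sigma, n = 2.84, 0.8, 100
confidence = 0.95
mu_0 = 3.0  # hypothesized mean from H0

# Critical z for a two-sided interval (about 1.96 for 95%).
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z_crit * sigma / sqrt(n)

lower, upper = x_bar - margin, x_bar + margin
contains_mu_0 = lower <= mu_0 <= upper

print(round(lower, 4), round(upper, 4), contains_mu_0)  # 2.6832 2.9968 False
```

The hypothesized mean falling outside the 95% interval corresponds to rejecting H0 in the two‐tailed test at α = 0.05.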

Example:

A local telephone company believes that the average length of a phone call is 8 minutes.

In a random sample of 58 phone calls, the sample mean was 7.8 minutes and the standard deviation was 0.5 minutes.

Is there enough evidence to reject the Null at α = 0.05?

H0: µ = 8

Ha: µ ≠ 8

At a significance level of 5% for the test of a difference there is 2.5% (0.025) in each tail.

Using function NORMSINV in Excel this gives a critical value of z of 1.96, so the critical values are −z0 = −1.96 and z0 = 1.96.

The standardized test statistic is

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{7.8 - 8}{0.5 / \sqrt{58}} \approx -3.05$$

The test statistic falls in the rejection region, so H0 is rejected.

At the 5% level of significance, there is enough evidence to reject the claim that the average length of a phone call is 8 minutes.

Example: Right‐Tail Z Test for Mean

A phone industry manager thinks that customer monthly cell phone bills have increased, and now average over $52 per month. The company wishes to test this claim. (Assume σ = 10 is known)

Form hypothesis test:

H0: μ ≤ 52 the average is not over $52 per month

H1: μ > 52 the average is greater than $52 per month (i.e., sufficient evidence exists to support the manager’s claim)

Find Rejection Region

Suppose that α = 0.10 is chosen for this test

For a one‐tail test at a significance level of 10%, there is 10% in the right tail. The area under the curve below the critical value is 100% ‐ 10% or 90.00%. Using [function NORMSINV] in Excel this gives a critical value of z of 1.28.

Reject H0 if Z > 1.28; otherwise do not reject H0.

Obtain sample and compute the test statistic

Suppose a sample is taken with the following results:

n = 64, X̄ = 53.1 (σ = 10 was assumed known)

– Then the test statistic is:

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{53.1 - 52}{10 / \sqrt{64}} = 0.88$$

Reach a decision and interpret the result:

Do not reject H0 since Z = 0.88 ≤ 1.28, i.e., there is not sufficient evidence that the mean bill is over $52.
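A short Python sketch of this right‐tail test is given below, using the numbers from the example. The variable names are illustrative; `inv_cdf(1 - alpha)` mirrors looking up the 90% point with NORMSINV.

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma = 52.0, 10.0  # hypothesized mean and known population std. dev.
n, x_bar = 64, 53.1       # sample size and observed sample mean
alpha = 0.10              # all of alpha is in the right tail

z = (x_bar - mu_0) / (sigma / sqrt(n))
z_crit = NormalDist().inv_cdf(1 - alpha)  # 90% of the area lies below this value
reject_h0 = z > z_crit

print(round(z, 2), round(z_crit, 2), reject_h0)  # 0.88 1.28 False
```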

t-Test for a Mean µ (n < 30, σ Unknown)

The t-test can be used when the population is normal or nearly normal, σ is unknown, and n < 30. The test statistic is

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$

The degrees of freedom are d.f. = n – 1.

Example:

Find the critical values t0 and −t0 for a two‐tailed test given α = 0.10 and n = 12.

The degrees of freedom are d.f. = n – 1 = 12 – 1 = 11.

Because the test is a two‐tail test, one critical value is negative and one is positive.

−t0 = − 1.796 and t0 = 1.796

Example:

Find the critical value t0 for a right‐tailed test given α = 0.05 and n = 11.

The degrees of freedom are d.f. = n – 1 = 11 – 1 = 10.

Because the test is a right‐tail test, the critical value is positive.

t0 = 1.81

Using function TINV from Excel the t‐value is 1.81.

The TINV function returns the two‐tailed inverse of the t‐distribution. A one‐tailed t‐value can be returned by replacing probability with 2*probability.

For a probability of 0.05 and degrees of freedom of 10, the two‐tailed value is calculated with TINV(0.05,10), which returns 2.228139.

The one‐tailed value for the same probability and degrees of freedom can be calculated with TINV(2*0.05,10), which returns 1.812462.

Example:

The average cost of a hotel room in Tel Aviv is said to be $168 per night.

A random sample of 25 hotels resulted in X̄ = $172.50 and S = $15.40.

Test at the α = 0.05 level.

(Assume the population distribution is normal)

H0: μ = 168

H1: μ ≠ 168

• α = 0.05

• n = 25

• σ is unknown, n<30 so use a t statistic

• Critical Value: t24 = ± 2.0639 (α/2 = .025 in each tail)

Reject H0 if t < −2.0639 or t > 2.0639; otherwise do not reject H0.

$$t_{n-1} = \frac{\bar{X} - \mu}{S / \sqrt{n}} = \frac{172.50 - 168}{15.40 / \sqrt{25}} = 1.46$$

Do not reject H0: not sufficient evidence that true mean cost is different than $168
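The hotel example can be checked with the sketch below. Python's standard library has no inverse t‐distribution, so the critical value 2.0639 is copied from the example rather than computed; the variable names are illustrative.

```python
from math import sqrt

mu_0 = 168.0     # hypothesized mean cost per night
x_bar = 172.50   # sample mean
s = 15.40        # sample standard deviation
n = 25           # sample size, so d.f. = n - 1 = 24
t_crit = 2.0639  # two-tailed critical value for d.f. = 24, alpha = 0.05 (from the example)

t = (x_bar - mu_0) / (s / sqrt(n))
reject_h0 = abs(t) > t_crit

print(round(t, 2), reject_h0)  # 1.46 False
```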


Connection to Confidence Intervals

For X̄ = 172.5, S = 15.40 and n = 25, the 95% confidence interval is:

172.5 ‐ (2.0639) 15.4/√25 to 172.5 + (2.0639) 15.4/√25

[166.14, 178.86]

Since this interval contains the hypothesized mean (168), we do not reject the null hypothesis at α = 0.05

Example:

A certain country has made its budget on the basis that the average individual tax payment for the year will be $30,000.

The financial controller takes a random sample of annual tax returns and these amounts in dollars are as follows.

34,000  12,000  16,000  10,000
 2,000  39,000   7,000  72,000
24,000  15,000  19,000  12,000
23,000  14,000   6,000  43,000

At a significance level, α, of 5% is there evidence that the average tax returns of the state will be different than the budget level of $30,000 in this year?

The null and alternative hypotheses are as follows:

Null hypothesis: H0: μ = $30,000.

Alternative hypothesis: H1: μ ≠ $30,000.

Since we have no information on the population standard deviation, and the sample size is less than 30, we use a t distribution (assuming the population is normally distributed).

Sample size, n, is 16.

Degrees of freedom, (n‐1), are 15.

Using [function TINV] from Excel the t value is 2.1315. Therefore, ±2.1315 are the critical values.

From Excel, using [function AVERAGE]: Mean value of this sample data is $21,750.00.

From [function STDEV] in Excel, the sample standard deviation, s, is $17,815.72

Estimate of the standard error is,

$$\frac{s}{\sqrt{n}} = \frac{17{,}815.72}{\sqrt{16}} = 4{,}453.93$$

The test statistic is,

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{21{,}750 - 30{,}000}{4{,}453.93} = -1.8523$$

Since the test statistic, −1.8523, is not less than the critical value of −2.1315, it does not fall in the rejection region. There is no reason to reject the null hypothesis, and so we accept that there is no evidence that the average of all the tax receipts will be significantly different from $30,000.
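The tax example can be reproduced from the raw data. The sketch below uses Python's `statistics.mean` and `statistics.stdev` in place of Excel's AVERAGE and STDEV; the critical value 2.1315 is copied from the example because the standard library has no inverse t‐distribution, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# The 16 sampled tax payments, in dollars.
tax_payments = [
    34_000, 12_000, 16_000, 10_000,
    2_000, 39_000, 7_000, 72_000,
    24_000, 15_000, 19_000, 12_000,
    23_000, 14_000, 6_000, 43_000,
]
mu_0 = 30_000    # budgeted (hypothesized) mean
t_crit = 2.1315  # two-tailed critical value for d.f. = 15, alpha = 0.05 (from the example)

n = len(tax_payments)
x_bar = mean(tax_payments)
s = stdev(tax_payments)       # sample standard deviation (n - 1 in the denominator)
standard_error = s / sqrt(n)

t = (x_bar - mu_0) / standard_error
reject_h0 = abs(t) > t_crit

print(round(t, 4), reject_h0)  # -1.8523 False
```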

Example: A local telephone company claims that the average length of a phone call is 8 minutes. In a random sample of 18 phone calls, the sample mean was 7.8 minutes and the standard deviation was 0.5 minutes. Is there enough evidence to reject this claim at α = 0.05?

H0: µ = 8

Ha: µ ≠ 8

The level of significance is α = 0.05.

The test is a two-tailed test.

Degrees of freedom are d.f. = 18 – 1 = 17.

The critical values are −t0 = −2.110 and t0 = 2.110

The standardized test statistic is

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{7.8 - 8}{0.5 / \sqrt{18}} \approx -1.70$$

The test statistic falls in the nonrejection region, so H0 is not rejected.

At the 5% level of significance, there is not enough evidence to reject the claim that the average length of a phone call is 8 minutes.

6 Steps in Hypothesis Testing

1. State the null hypothesis, H0 and the alternative hypothesis, H1

2. Choose the level of significance, α, and the sample size, n

3. Determine the appropriate test statistic and sampling distribution

4. Determine the critical values that divide the rejection and nonrejection regions

5. Collect data and compute the value of the test statistic.

6. Make the statistical decision. If the test statistic falls into the nonrejection region, do not reject the null hypothesis H0. If the test statistic falls into the rejection region, reject the null hypothesis.


APPENDIX

Types of Errors

No matter which hypothesis represents the claim, always begin the hypothesis test assuming that the null hypothesis is true.

At the end of the test, one of two decisions will be made:

1. reject the null hypothesis, or

2. fail to reject the null hypothesis.

A Type I error occurs if the null hypothesis is rejected when it is true.

A Type II error occurs if the null hypothesis is not rejected when it is false.

                      Actual Truth of H0
Decision              H0 is true          H0 is false
Do not reject H0      Correct Decision    Type II Error
Reject H0             Type I Error        Correct Decision

The probability of Type I Error is α
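A small simulation can make this statement concrete. The sketch below is an illustration only: it repeatedly draws samples from a Normal population in which H0 is actually true, runs a two‐tailed z test each time, and counts how often H0 is (wrongly) rejected. The population values, sample size, number of trials and seed are arbitrary assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(0)  # fixed seed so the run is repeatable

# Illustrative values: H0 (mu = mu_0) is true in this simulated population.
mu_0, sigma, n, alpha = 3.0, 0.8, 25, 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
trials = 4000

rejections = 0
for _ in range(trials):
    sample = [random.gauss(mu_0, sigma) for _ in range(n)]
    x_bar = sum(sample) / n
    z = (x_bar - mu_0) / (sigma / sqrt(n))
    if abs(z) > z_crit:
        rejections += 1  # a Type I error: H0 is true but was rejected

type_i_rate = rejections / trials
print(type_i_rate)  # should be close to alpha
```

Over many trials the observed rejection rate should settle near α, which is what "the probability of Type I Error is α" means.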

Example: The Haifa Technion claims that 94% of their graduates find employment within six months of graduation. What will a type I or type II error be?

H0: p = 0.94 (Claim)

H1: p ≠ 0.94

A Type I error is rejecting the null when it is true. The population proportion is actually 0.94, but is rejected. (We believe it is not 0.94.)

A Type II error is failing to reject the null when it is false. The population proportion is not 0.94, but is not rejected. (We believe it is 0.94.)
