--- Unclassified ---
Quantitative Methods 2013
Hypothesis Testing with One Sample
Concept of Hypothesis Testing
Testing Hypotheses is another way to deal with the problem of making a statement about an unknown population parameter, based on a random sample.
Instead of finding an estimate for the parameter, we can often find it convenient to hypothesize a value for it and then use the information from the sample to confirm or refute the hypothesized value.
A hypothesis is a judgment about the population parameter based simply on an assumption or intuition with no concrete backup information or analysis.
For example, a hypothesis about a population mean:
The mean monthly cell phone bill in Haifa is μ ≤ $42.
Hypothesis testing
Hypothesis testing is the comparison of the analyst's belief or claim about a population parameter to the corresponding sample statistic and deciding whether or not the belief or claim about the population parameter is correct.
Example: Filling of soft drink bottles in a bottling plant.
Managers have the belief that the machine is filling 1L bottles with 1L of the soft drink.
This belief is tested by drawing a sample of filled bottles, measuring the amount of soft drink in each, finding the mean and variance of the amount of soft drink in the sample, comparing the sample mean with the 1L belief, and finally deciding whether the information from the sample supports or refutes the 1L belief.
Stating a Hypothesis
The Null Hypothesis, H0
This is the hypothesis or claim that is initially assumed to be true.
It is what is believed to be, presumed to be, or accepted as true and correct without any evidentiary support.
In the bottling scenario, managers believe the machine is filling the bottles correctly even though they have no evidence to support that belief.
The null hypothesis in that case is that the bottling machine is functioning correctly.
The Null Hypothesis, H0
Example: The average number of TV sets in U.S. homes is equal to three (H0: μ = 3)
Always contains the “=”, “≤” or “≥” sign.
Is always about a population parameter, not about a sample statistic: write H0: μ = 3, not H0: X̄ = 3.
The Alternative Hypothesis, H1 , (Ha)
This is the hypothesis or claim which we initially assume to be false but which we may decide to accept if there is sufficient evidence.
In the bottling scenario, managers will not believe the machine is filling the bottles incorrectly unless they have evidence to support that belief.
It is the sample which provides this evidence. The alternative hypothesis in that case is that the bottling machine is not functioning correctly.
The Alternative Hypothesis, H1 , (Ha)
Is the opposite (the complement) of the null hypothesis
e.g., The average number of TV sets in U.S. homes is not equal to 3 ( H1: μ ≠ 3 )
Never contains the “=” , “≤” or “≥” sign
This procedure is familiar to us already from the legal system:
“Innocent until proven guilty”.
The Null Hypothesis (Innocent) is only rejected in favor of the Alternative Hypothesis (Guilty) if there is sufficient evidence of this.
Example:
A weight loss company claims its product, when used as directed, will lead to at least 10 kilograms of weight loss within 45 days.
A consumer protection agency wants to verify the company's claim in order to protect the public from fraud.
What is the hypothesis statement the agency director should use?
The consumer protection agency wants evidence that the weight loss product does not perform as claimed, that is, that it yields less than 10 kg of weight loss. This is the alternative hypothesis, the complement of the null hypothesis.
The entire hypothesis statement is:
H0: µ ≥ 10 kg
Ha: µ < 10 kg
Rejection Regions
One way of performing a Hypothesis test is to compute a rejection region.
If we find that our sample statistic is in this region then we reject our Null Hypothesis.
What is important is the idea that the rejection region is a region far away from the value stated in our Null Hypothesis, and that it is unlikely that we would observe a sample with a value of the sample statistic (for example the sample mean, X̄) this far away from the Null Hypothesised value if that Null Hypothesis was true.
For example, consider a class of 100 people.
Suppose our Null Hypothesis is that the average age of the whole class is 31.
H0: the population mean age is 31.
H0: µ = 31
Ha: µ ≠ 31
However, suppose we now observe a value of 20 in a sample we have randomly chosen. That’s very unlikely to have happened by chance if our Null Hypothesis was true. It’s much more likely that our Null Hypothesis is false.
If instead we took a sample from our population and the average age was 33 years, we might say that the average age of the population is not significantly different from 31. In this case we would accept (that is, fail to reject) the Null Hypothesis.
Random sample: suppose the sample mean age is 20 (X̄ = 20).
Is X̄ = 20 likely if μ = 31?
If not likely, REJECT the Null Hypothesis.
So we decide to reject our Null Hypothesis in favour of an alternative, which is that the true average age of the whole class is different from 31 years.
[Figure: Sampling distribution of X̄, centred at μ = 31 if H0 is true, with the observed sample mean X̄ = 20 marked in the lower tail.]
We reject the null hypothesis that μ = 31 if the sample mean falls in either of the two tail regions.
If the sample mean falls in the central region we accept the null hypothesis.
Rejection Regions and Critical Values
A rejection region (or critical region) of the sampling distribution is the range of values for which the null hypothesis is not probable.
A critical value separates the rejection region from the nonrejection region.
Level of Significance, α
α defines the unlikely values of the sample statistic if the null hypothesis is true.
– Defines the rejection region of the sampling distribution
Typical values of α are 0.01, 0.05, or 0.10
Level of Significance, α
α is selected by the researcher at the beginning.
By setting the level of significance at a small value, you are saying that you want the probability of rejecting a true null hypothesis to be small.
α provides the critical value(s) of the test.
There are three types of hypothesis tests: a left-, right-, or two-tailed test.
The type of test depends on the region of the sampling distribution that favors a rejection of H0. This region is indicated by the alternative hypothesis.
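[Figure: sampling distribution curves centred at 0, with a shaded tail of area α (one-tailed tests) or α/2 in each tail (two-tailed test).]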
Left‐tailed Test
1. If the alternative hypothesis contains the less‐than inequality symbol (<), the hypothesis test is a left‐tailed test.
H0: μ ≥ k
H1: μ < k
[Figure: left tail of area α (the level of significance) shaded; the boundary of the shaded tail represents the critical value.]
Right-tailed Test
2. If the alternative hypothesis contains the greater‐than symbol (>), the hypothesis test is a right‐tailed test.
H0: μ ≤ k
H1: μ > k
[Figure: right tail of area α shaded.]
Two-tailed Test
3. If the alternative hypothesis contains the not‐equal‐to symbol (≠), the hypothesis test is a two‐tailed test. In a two‐tailed test, each tail has an area of α/2.
H0: μ = k
H1: μ ≠ k
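The three cases above can be sketched in code. This is a minimal illustration, assuming a z test and using Python's standard library; the function name `critical_values` and the labels "left", "right" and "two" are hypothetical names chosen for this sketch.

```python
from statistics import NormalDist


def critical_values(alpha, tail):
    """Return the z critical value(s) that bound the rejection region.

    tail is "left", "right" or "two" (labels invented for this sketch).
    """
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "left":
        # All of alpha sits in the left tail.
        return (std_normal.inv_cdf(alpha),)
    if tail == "right":
        # All of alpha sits in the right tail.
        return (std_normal.inv_cdf(1 - alpha),)
    if tail == "two":
        # alpha is split evenly: alpha/2 in each tail.
        z0 = std_normal.inv_cdf(1 - alpha / 2)
        return (-z0, z0)
    raise ValueError("tail must be 'left', 'right' or 'two'")


two = critical_values(0.05, "two")
right = critical_values(0.10, "right")
print([round(v, 2) for v in two], round(right[0], 2))
```

The same α therefore gives a larger critical value in a two-tailed test than in a one-tailed test, because only half of α lies in each tail.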
Statistical Tests
The statistic that is compared with the parameter in the null hypothesis is called the test statistic.
After stating the null and alternative hypotheses and specifying the level of significance, a random sample is taken from the population and sample statistics are calculated.
A test statistic tells us how far, in standard deviations, a sample mean is from the population mean stated in the null hypothesis. The larger the absolute value of the test statistic, the further the sample mean is from that hypothesized mean.
If the population standard deviation is known, and the population is (approximately) Normally distributed, then the test statistic is
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
μ is the hypothesized population mean.
x̄ is the sample mean.
σ is the population standard deviation.
n is the sample size.
σ/√n is the standard error.
z is how many standard errors the observed sample mean is from the hypothesized mean.
If the population standard deviation is unknown then the only standard deviation we can determine is the sample standard deviation, s.
This value of s can be considered an estimate of the population standard deviation.
If the sample size is more than 30 we can use the Normal Distribution (Central Limit Theorem):
$$z = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
Example: Find the critical value and rejection region for a right-tailed test with α = 0.5% (0.005).
z0 = 2.575
The rejection region is to the right of z0 = 2.575.
Decision Rule Based on Rejection Region
To use a rejection region to conduct a hypothesis test, calculate the test statistic, z.
If the test statistic
1. is in the rejection region, then reject H0.
2. is not in the rejection region, then fail to reject H0.
Right-Tailed Test: reject H0 if z > z0; otherwise fail to reject H0.
Two-Tailed Test: reject H0 if z < −z0 or z > z0; otherwise fail to reject H0.
Left-Tailed Test: reject H0 if z < z0, where the critical value z0 is negative; otherwise fail to reject H0.
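The decision rule can be written as a small function. This is a sketch only: the name `decide` and the tail labels are hypothetical, and, unlike the left-tailed notation on the slide, the argument `z0` here is always the positive magnitude of the critical value. The input values in the usage lines are illustrative and not taken from the text.

```python
def decide(z, z0, tail):
    """Apply the rejection-region rule to a test statistic z.

    z0 is the positive magnitude of the critical value (assumption of this
    sketch); tail is "left", "right" or "two".
    """
    if tail == "left":
        reject = z < -z0
    elif tail == "right":
        reject = z > z0
    elif tail == "two":
        reject = z < -z0 or z > z0
    else:
        raise ValueError("tail must be 'left', 'right' or 'two'")
    return "reject H0" if reject else "fail to reject H0"


# Illustrative values only.
print(decide(2.5, 1.96, "two"))
print(decide(-2.5, 1.645, "right"))
```

Note that a large negative statistic never leads to rejection in a right-tailed test: only the tail named by the alternative hypothesis counts.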
Hypothesis Testing Example
Test the claim that the true mean # of TV sets in US homes is equal to 3.
(Assume σ = 0.8)
1. State the appropriate null and alternative hypotheses
H0: μ = 3, H1: μ ≠ 3 (This is a two‐tail test)
2. Specify the desired level of significance and the sample size
Suppose that α = 0.05 and n = 100 are chosen for this test
3. Determine the appropriate technique
σ is known, so this is a Z test.
4. Determine the critical values
At a significance level of 5% for a two-tailed test (a test of a difference) there is 2.5% in each tail.
Using the NORMSINV function in Excel, this gives critical values of z of ±1.96.
5. Collect the data and compute the test statistic
Suppose the sample results are
n = 100, X̄ = 2.84 (σ = 0.8 is assumed known)
So the test statistic is:
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{2.84 - 3}{0.8 / \sqrt{100}} = \frac{-0.16}{0.08} = -2.0$$
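The arithmetic of step 5 can be checked with a few lines of Python. This is a minimal sketch; the function name `z_statistic` is a hypothetical helper, and the inputs are the values given in the example above.

```python
import math


def z_statistic(sample_mean, mu0, sigma, n):
    """Number of standard errors between the sample mean and the hypothesized mean."""
    standard_error = sigma / math.sqrt(n)  # sigma / sqrt(n)
    return (sample_mean - mu0) / standard_error


# Values from the TV-set example: X-bar = 2.84, mu = 3, sigma = 0.8, n = 100.
z = z_statistic(sample_mean=2.84, mu0=3.0, sigma=0.8, n=100)
print(round(z, 2))
```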
6. Is the test statistic in the rejection region?
Reject H0 if Z < −1.96 or Z > 1.96; otherwise do not reject H0.
[Figure: standard normal curve with rejection regions of area α/2 = 0.025 below −1.96 and above +1.96, and the non-rejection region between them.]
Here, Z = −2.0 < −1.96, so the test statistic is in the rejection region.
Since Z = ‐2.0 < ‐1.96, we reject the null hypothesis and conclude that there is sufficient evidence that the mean number of TVs in US homes is not equal to 3.
Connection to Confidence Intervals
For X̄ = 2.84, σ = 0.8 and n = 100, the 95% confidence interval is:
$$2.84 - (1.96)\frac{0.8}{\sqrt{100}} \quad \text{to} \quad 2.84 + (1.96)\frac{0.8}{\sqrt{100}}$$
[2.6832, 2.9968]
Since this interval does not contain the hypothesized mean (3.0), we reject the null hypothesis at α = 0.05
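The link between the interval and the test can be reproduced as follows. This is a sketch under the same assumptions as the example (σ known, two-tailed test); `z_confidence_interval` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence):
    """Two-sided confidence interval for the mean when sigma is known."""
    alpha = 1 - confidence
    z0 = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for 95%
    margin = z0 * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


low, high = z_confidence_interval(2.84, 0.8, 100, 0.95)
mu0 = 3.0
# Rejecting H0 at alpha = 0.05 is equivalent to mu0 lying outside the 95% interval.
reject = not (low <= mu0 <= high)
print(round(low, 4), round(high, 4), reject)
```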
Example:
A local telephone company believes that the average length of a phone call is 8 minutes.
In a random sample of 58 phone calls, the sample mean was 7.8 minutes and the standard deviation was 0.5 minutes.
Is there enough evidence to reject the Null at α = 0.05?
H0: µ = 8
Ha: µ ≠ 8
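At a significance level of 5% for a two-tailed test (a test of a difference) there is 2.5% in each tail.
Using the NORMSINV function in Excel, this gives critical values of −z0 = −1.96 and z0 = 1.96.
[Figure: standard normal curve with an area of 0.025 in each tail, beyond −z0 = −1.96 and z0 = 1.96.]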
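The standardized test statistic is
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \approx \frac{7.8 - 8}{0.5 / \sqrt{58}} \approx -3.05,$$
using the sample standard deviation s = 0.5 as the estimate of σ, since n > 30.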
The test statistic falls in the rejection region, so H0 is rejected.
At the 5% level of significance, there is enough evidence to reject the claim that the average length of a phone call is 8 minutes.
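The phone-call example can be checked numerically. The sketch below assumes, as the example does, that with n = 58 the sample standard deviation s can stand in for σ; the variable names are chosen for illustration.

```python
import math
from statistics import NormalDist

# Values from the example: sample of 58 calls, mean 7.8, s = 0.5, H0: mu = 8.
x_bar, mu0, s, n = 7.8, 8.0, 0.5, 58
alpha = 0.05

z = (x_bar - mu0) / (s / math.sqrt(n))    # s replaces sigma because n > 30
z0 = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
reject = abs(z) > z0

print(round(z, 2), round(z0, 2), reject)
```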
Example: Right‐Tail Z Test for Mean
A phone industry manager thinks that customer monthly cell phone bills have increased, and now average over $52 per month. The company wishes to test this claim. (Assume σ = 10 is known)
Form hypothesis test:
H0: μ ≤ 52 the average is not over $52 per month
H1: μ > 52 the average is greater than $52 per month (i.e., sufficient evidence exists to support the manager’s claim)
Find Rejection Region
Suppose that α = 0.10 is chosen for this test
For a one‐tail test at a significance level of 10%, there is 10% in the right tail. The area under the curve to the left of the critical value is 100% − 10%, or 90%. Using the NORMSINV function in Excel, this gives a critical value of z of 1.28.
[Figure: standard normal curve with the right tail of area α = 0.10 shaded beyond z = 1.28.]
Reject H0 if Z > 1.28; otherwise do not reject H0.
Obtain sample and compute the test statistic
Suppose a sample is taken with the following results:
n = 64, X̄ = 53.1 (σ = 10 was assumed known)
Then the test statistic is:
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{53.1 - 52}{10 / \sqrt{64}} = 0.88$$
Reach a decision and interpret the result:
Do not reject H0 since Z = 0.88 ≤ 1.28, i.e., there is not sufficient evidence that the mean bill is over $52.
[Figure: standard normal curve with the rejection region of area α = 0.10 to the right of 1.28; Z = 0.88 falls in the non-rejection region.]
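The whole right-tailed test can be sketched as one function. The name `right_tailed_z_test` is hypothetical; the critical value is computed with `NormalDist.inv_cdf`, which plays the role NORMSINV plays in the text.

```python
import math
from statistics import NormalDist


def right_tailed_z_test(sample_mean, mu0, sigma, n, alpha):
    """Return (z, critical value, reject?) for H0: mu <= mu0 vs H1: mu > mu0."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    z0 = NormalDist().inv_cdf(1 - alpha)  # all of alpha in the right tail
    return z, z0, z > z0


# Values from the cell phone bill example.
z, z0, reject = right_tailed_z_test(53.1, 52.0, 10.0, 64, 0.10)
print(round(z, 2), round(z0, 2), reject)
```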
t-Test for a Mean µ (n < 30, σ Unknown)
The t-test can be used when the population is normal or nearly normal, σ is unknown, and n < 30.
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
The degrees of freedom are d.f. = n – 1 .
Example:
Find the critical values t0 and −t0 for a two‐tailed test given α = 0.10 and n = 12.
The degrees of freedom are d.f. = n – 1 = 12 – 1 = 11.
Because the test is a two‐tail test, one critical value is negative and one is positive.
−t0 = − 1.796 and t0 = 1.796
Example:
Find the critical value t0 for a right‐tailed test given α = 0.05 and n = 11.
The degrees of freedom are d.f. = n – 1 = 11 – 1 = 10.
Because the test is a right‐tail test, the critical value is positive.
t0 = 1.81
Using function TINV from Excel the t‐value is 1.81.
The TINV function returns the two‐tailed inverse of the t‐distribution.
A one‐tailed t‐value can be returned by replacing probability with 2*probability.
For a probability of 0.05 and degrees of freedom of 10, the two‐tailed value is calculated with TINV(0.05,10), which returns 2.228139.
The one‐tailed value for the same probability and degrees of freedom can be calculated with TINV(2*0.05,10), which returns 1.812462.
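The relationship between the two-tailed TINV and a one-tailed critical value can be written out as follows. This is a sketch in standard notation, assuming T_ν denotes a t-distributed variable with ν degrees of freedom (a symbol not used elsewhere in these slides):

```latex
P\left(|T_{\nu}| > t\right) = p \iff t = \mathrm{TINV}(p, \nu)
\qquad
P\left(T_{\nu} > t\right) = p \iff t = \mathrm{TINV}(2p, \nu)
```

The second form follows from the symmetry of the t-distribution: an area of p in one tail corresponds to a total area of 2p in both tails. With ν = 10 and p = 0.05 it gives TINV(0.10, 10) ≈ 1.812.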
Example:
The average cost of a hotel room in Tel Aviv is said to be $168 per night.
A random sample of 25 hotels resulted in X̄ = $172.50 and S = $15.40.
Test at the α = 0.05 level.
(Assume the population distribution is normal)
H0: μ = 168
H1: μ ≠ 168
• α = 0.05
• n = 25
• σ is unknown, n<30 so use a t statistic
• Critical Value: t24 = ± 2.0639
[Figure: t distribution with 24 degrees of freedom; rejection regions of area α/2 = 0.025 below −2.0639 and above 2.0639, non-rejection region between them.]
$$t_{n-1} = \frac{\bar{X} - \mu}{S / \sqrt{n}} = \frac{172.50 - 168}{15.40 / \sqrt{25}} = 1.46$$
Do not reject H0: since t = 1.46 lies between −2.0639 and 2.0639, there is not sufficient evidence that the true mean cost is different from $168.
Connection to Confidence Intervals
For X̄ = 172.5, S = 15.40 and n = 25, the 95% confidence interval is:
$$172.5 - (2.0639)\frac{15.4}{\sqrt{25}} \quad \text{to} \quad 172.5 + (2.0639)\frac{15.4}{\sqrt{25}}$$
[166.14, 178.86]
Since this interval contains the hypothesized mean (168), we do not reject the null hypothesis at α = 0.05
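Both routes to the hotel-room decision, the t statistic and the confidence interval, can be checked side by side. This is a sketch only: the critical value 2.0639 is taken from the text rather than computed, because Python's standard library has no inverse t-distribution.

```python
import math

# Values from the hotel example.
x_bar, s, n, mu0 = 172.50, 15.40, 25, 168.0
t_crit = 2.0639  # critical value for 24 d.f. quoted in the text, not computed here

standard_error = s / math.sqrt(n)
t = (x_bar - mu0) / standard_error

# 95% confidence interval built from the same critical value.
low = x_bar - t_crit * standard_error
high = x_bar + t_crit * standard_error

reject_by_statistic = abs(t) > t_crit
reject_by_interval = not (low <= mu0 <= high)

print(round(t, 2), round(low, 2), round(high, 2))
print(reject_by_statistic, reject_by_interval)
```

The two decisions always agree for a two-tailed test, since both compare the same distance against the same margin.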
Example:
A certain country has made its budget on the basis that the average individual tax payment for the year will be $30,000.
The financial controller takes a random sample of annual tax returns; the amounts, in dollars, are as follows.
34,000 12,000 16,000 10,000
2,000 39,000 7,000 72,000
24,000 15,000 19,000 12,000
23,000 14,000 6,000 43,000
At a significance level, α, of 5%, is there evidence that the average tax returns of the state will be different from the budget level of $30,000 in this year?
The null and alternative hypotheses are as follows:
Null hypothesis: H0: μ = $30,000.
Alternative hypothesis: H1: μ ≠ $30,000.
Since we have no information on the population standard deviation, and the sample size is less than 30, we use a t distribution (assuming the population is normally distributed).
Sample size, n, is 16.
Degrees of freedom, (n‐1), are 15.
Using [function TINV] from Excel the t value is 2.1315. Therefore, ±2.1315 are the critical values.
From Excel, using [function AVERAGE]: the mean value of this sample data is $21,750.00.
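From [function STDEV] in Excel, the sample standard deviation, s, is $17,815.72
The estimate of the standard error is
$$\frac{s}{\sqrt{n}} = \frac{17{,}815.72}{\sqrt{16}} = 4{,}453.93$$
The test statistic is
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{21{,}750 - 30{,}000}{4{,}453.93} = -1.8523$$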
Since the test statistic, −1.8523, lies between the critical values of ±2.1315 (its absolute value, 1.8523, is less than 2.1315), there is no reason to reject the null hypothesis, and so we accept that there is no evidence that the average of all the tax receipts will be significantly different from $30,000.
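The tax example can be reproduced from the raw data. In this sketch `statistics.mean` and `statistics.stdev` stand in for Excel's AVERAGE and STDEV, and the critical value 2.1315 is taken from the text rather than computed; the variable names are chosen for illustration.

```python
import math
import statistics

# The 16 sampled tax payments, in dollars, from the example.
payments = [34000, 12000, 16000, 10000, 2000, 39000, 7000, 72000,
            24000, 15000, 19000, 12000, 23000, 14000, 6000, 43000]
mu0 = 30000
t_crit = 2.1315  # two-tailed critical value for 15 d.f. quoted in the text

n = len(payments)
mean = statistics.mean(payments)   # counterpart of Excel AVERAGE
s = statistics.stdev(payments)     # sample standard deviation (n - 1 denominator)
standard_error = s / math.sqrt(n)
t = (mean - mu0) / standard_error
reject = abs(t) > t_crit

print(mean, round(s, 2), round(standard_error, 2), round(t, 4), reject)
```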
Example: A local telephone company claims that the average length of a phone call is 8 minutes. In a random sample of 18 phone calls, the sample mean was 7.8 minutes and the standard deviation was 0.5 minutes. Is there enough evidence to reject this claim at α = 0.05?
H0: µ = 8
Ha: µ ≠ 8
The level of significance is α = 0.05.
The test is a two-tailed test.
Degrees of freedom are d.f. = 18 – 1 = 17.
The critical values are −t0 = −2.110 and t0 = 2.110
The standardized test statistic is
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{7.8 - 8}{0.5 / \sqrt{18}} \approx -1.70.$$
The test statistic falls in the nonrejection region, so H0 is not rejected.
At the 5% level of significance, there is not enough evidence to reject the claim that the average length of a phone call is 8 minutes.
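The effect of sample size on this example can be seen by computing the same statistic for n = 18 and for the earlier sample of n = 58 calls. This is a sketch; `t_statistic` is a hypothetical helper, and the critical value 2.110 is the one quoted in the text for 17 degrees of freedom.

```python
import math


def t_statistic(x_bar, mu0, s, n):
    """Standardized distance between the sample mean and the hypothesized mean."""
    return (x_bar - mu0) / (s / math.sqrt(n))


t_small = t_statistic(7.8, 8.0, 0.5, 18)
t_crit = 2.110  # two-tailed critical value for 17 d.f., taken from the text
reject_small = abs(t_small) > t_crit

# Same mean and standard deviation, but the larger sample from the earlier example.
stat_large = t_statistic(7.8, 8.0, 0.5, 58)

print(round(t_small, 2), reject_small, round(stat_large, 2))
```

With the same sample mean and standard deviation, the smaller sample has a larger standard error, so the statistic stays inside the non-rejection region.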
6 Steps in Hypothesis Testing
1. State the null hypothesis, H0 and the alternative hypothesis, H1
2. Choose the level of significance, α, and the sample size, n
3. Determine the appropriate test statistic and sampling distribution
4. Determine the critical values that divide the rejection and nonrejection regions
5. Collect data and compute the value of the test statistic.
6. Make the statistical decision. If the test statistic falls into the nonrejection region, do not reject the null hypothesis H0. If the test statistic falls into the rejection region, reject the null hypothesis.
APPENDIX
Types of Errors
No matter which hypothesis represents the claim, always begin the hypothesis test assuming that the null hypothesis is true.
At the end of the test, one of two decisions will be made:
1. reject the null hypothesis, or
2. fail to reject the null hypothesis.
A Type I error occurs if the null hypothesis is rejected when it is true.
A Type II error occurs if the null hypothesis is not rejected when it is false.
                    Actual Truth of H0
Decision            H0 is true          H0 is false
Do not reject H0    Correct Decision    Type II Error
Reject H0           Type I Error        Correct Decision
The probability of Type I Error is α
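The statement that the probability of a Type I error is α can be illustrated by simulation. The sketch below is not part of the original slides: it repeatedly draws samples from a population in which H0 is true (using the TV-set values μ = 3, σ = 0.8, n = 100 as an assumed setting) and counts how often a two-tailed z test rejects. The function name, the number of trials and the seed are arbitrary choices.

```python
import math
import random
from statistics import NormalDist, fmean


def type_one_error_rate(mu0, sigma, n, alpha, trials, seed):
    """Fraction of samples, drawn with H0 true, for which H0 is rejected."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z0 = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # The population mean really is mu0, so every rejection is a Type I error.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / math.sqrt(n))
        if abs(z) > z0:
            rejections += 1
    return rejections / trials


rate = type_one_error_rate(mu0=3.0, sigma=0.8, n=100, alpha=0.05, trials=4000, seed=0)
print(rate)
```

The printed rate should be close to 0.05, though not exactly equal to it, because it is itself an estimate from a finite number of trials.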
Example: The Haifa Technion claims that 94% of their graduates find employment within six months of graduation. What will a type I or type II error be?
H0: p = 0.94 (Claim)
H1: p ≠ 0.94
A Type I error is rejecting the null when it is true. The population proportion is actually 0.94, but is rejected. (We believe it is not 0.94.)
A Type II error is failing to reject the null when it is false. The population proportion is not 0.94, but is not rejected. (We believe it is 0.94.)