BIOSTATISTICS NURS 3324
Hypothesis Testing
Hypothesis testing for a population mean (one sample test)
A hypothesis is a statement about one (or more) population parameters.
There are two hypotheses:
1- Null hypothesis:
- The null hypothesis is the statement that there is "no effect" or "no difference",
that is why the word "null" is used. It is denoted by H0 and always contains the
equality "=" sign. It should be stressed that researchers frequently put forward a
null hypothesis in the hope that they can discredit it. The null hypothesis is always of the form
H0: Population parameter = specified number
2- Alternative hypothesis:
- The alternative hypothesis in turn is the "opposite" of the null hypothesis, that is,
there is an effect or difference. It is the hypothesis that we try to establish and is
denoted by Ha. It never contains the equality "=" sign.
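The alternative hypothesis takes one of the following forms:
Ha: Population parameter ≠ specified number (two-tailed)
Ha: Population parameter < specified number (left-tailed)
Ha: Population parameter > specified number (right-tailed)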
Either null hypothesis (H0) or the alternative
hypothesis (Ha) is true, but not both i.e. they
cannot simultaneously be true.
Steps of Hypothesis Testing (p-value approach)
The p-value is defined informally as the probability of obtaining results at least as
extreme as those observed in the study, by chance alone, if the null hypothesis is true.
When you perform a hypothesis test in statistics, a p-value helps you determine the
significance of your results.
1. Collect the data, i.e., obtain a random sample from the population(s) of interest.
2. Decide whether a one- or a two-tailed (sided) test is appropriate; this decision
depends on the research question.
Two-tailed      One-tailed (left-tailed)   One-tailed (right-tailed)
H0: µ = µ0      H0: µ = µ0                 H0: µ = µ0
Ha: µ ≠ µ0      Ha: µ < µ0                 Ha: µ > µ0
Note: µ0 is the assumed value of the population mean.
3. State a Null (H0) and Alternative hypothesis (Ha).
4. Choose a level of significance (α = .001, .01, or .05).
5. Specify the rejection region/s.
The location of the rejection region depends on whether the test is one-tailed or
two-tailed.
a. For a one-tailed test in which the symbol ">" occurs in Ha, the rejection region
consists of an area (= α) in the upper tail of the sampling distribution.
Figure Rejection region = α in the upper tail if Ha: µ > µ0
b. For a one-tailed test in which the symbol "<" appears in Ha, the rejection region
consists of an area (= α) in the lower tail of the sampling distribution.
c. For a two-tailed test, in which the symbol "≠" occurs in Ha, the rejection region
consists of two areas (α/2 in the upper tail + α/2 in the lower tail = α).
6. Choose an appropriate test statistic that is reasonable in the context of the given
hypothesis test.
If the population has a normal distribution with known variance:
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$
If n is large (n ≥ 30), population is normal (or not normal), variance unknown:
$$z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
If n is small (n < 30), population is approximately normal, variance unknown:
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
7. Calculate the test statistic.
8. Use the table (standard normal distribution or Student's t-table) to find the
probability or the area(s) in the tail beyond (above or below, depending on the
alternative hypothesis) the value of the test statistic; that is the p-value of the test.
Figure Rejection region = α/2 + α/2 = α in the two tails if Ha: µ ≠ µ0
Figure Rejection region = α in the lower tail if Ha: µ < µ0
For a two-tailed test, the p-value is the total of the areas to the left and the right of the
test statistic.
9. Check the associated probability (p-value)
10. Make a decision
√ If the p-value falls within the rejection region, i.e. p-value < α, H0 is rejected at
the predetermined level, and we can say that the result is statistically
significant.
√ If the p-value does not fall within the rejection region, i.e. p-value ≥ α, H0 is not
rejected at the predetermined level, and we can say that the result is
not statistically significant.
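The steps above can be sketched in a few lines of Python for the large-sample z case. This is only an illustrative sketch: the function names (`z_statistic`, `p_value_z`, `decide`) are hypothetical, and `statistics.NormalDist` from the standard library stands in for the printed standard normal table.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(xbar, mu0, sd, n):
    """Steps 6-7: standardized distance of the sample mean from mu0."""
    return (xbar - mu0) / (sd / sqrt(n))


def p_value_z(z, tail):
    """Step 8: tail area beyond z under the standard normal curve.

    tail is "right" for Ha: mu > mu0, "left" for Ha: mu < mu0,
    and "two" for Ha: mu != mu0.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "right":
        return 1 - std_normal.cdf(z)
    if tail == "left":
        return std_normal.cdf(z)
    if tail == "two":
        # total of the two tail areas beyond -|z| and +|z|
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'right', 'left' or 'two'")


def decide(p_value, alpha=0.05):
    """Step 10: reject H0 only when the p-value is below alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"
```

For instance, `decide(p_value_z(z_statistic(10, 8, 4, 16), "right"))` walks through steps 6 to 10 for made-up numbers (sample mean 10, µ0 = 8, standard deviation 4, n = 16).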
Researchers and statisticians generally agree on the following conventions for
interpreting p-values
p-value Result is:
p > 0.05 not significant
p ≤ 0.05 significant
p ≤ 0.01 highly significant
p ≤ 0.001 very highly significant
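The conventions in this table can be written as a small lookup function. The sketch below is illustrative only; the name `describe_p_value` is hypothetical, and the thresholds are exactly those listed in the table.

```python
def describe_p_value(p):
    """Map a p-value to the conventional label from the table above."""
    if p <= 0.001:
        return "very highly significant"
    if p <= 0.01:
        return "highly significant"
    if p <= 0.05:
        return "significant"
    return "not significant"


for p in (0.2, 0.04, 0.008, 0.0004):
    print(p, describe_p_value(p))
```

The smallest threshold is checked first so that each p-value receives the strongest label it qualifies for.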
One-sample test of hypothesis about a population mean
The one-sample test is a statistical procedure used to compare the mean of your
sample data to a known value.
Example 1. In 128 patients under 12 years of age with a particular congenital heart defect, the
mean intensive care unit stay after surgery was 4.7 days with a standard deviation of
7.8. Can we conclude that the average intensive care unit (ICU) stay of patients under
12 with this defect is more than 3.5 days? Use α = 0.05.
Solution
Note that the sample size n = 128 is sufficiently large so that the sampling distribution
of x̄ is approximately normal and that s provides a good approximation to σ. Since
the required assumption is satisfied, we may proceed with a large-sample test of
hypothesis about µ.
1. Data: see the previous example
2. A one-sided test
3. Formulate the hypotheses as
H0: µ = 3.5
Ha: µ > 3.5
Figure Two-tailed p-value: the total area beyond -(test statistic) and +(test statistic) on the z axis
4. The significance level α = 0.05
5. Specify the rejection region. We are dealing with a one-tailed test in which the
symbol ">" occurs in Ha, so the rejection region is an area of 0.05 in the upper tail of the
sampling distribution of the standardized test statistic.
Figure Rejection region for Example 1
6-7. Choose and compute the value of the test statistic:
$$z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{4.7 - 3.5}{7.8/\sqrt{128}} = 1.741$$
8. Find the probability that the test statistic is in the tail beyond the calculated value,
i.e. P(z > 1.741) = 1 – P(z ≤ 1.741). The p-value for this test is 1 – 0.9591 = 0.0409.
9. Check the associated probability (p-value). This value is < α.
10. Decision Since the p-value falls within the rejection region, i.e. p-value < α, we reject H0 at the
0.05 level of α; thus we can say that the result is statistically significant.
Conclusion There is sufficient evidence (at α = 0.05) to conclude that the
mean intensive care unit stay after surgery is significantly greater than 3.5 days.
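As a quick numerical check of Example 1, the sketch below recomputes the test statistic and the right-tail p-value from the summary values given in the example. The variable names are illustrative, and `statistics.NormalDist` is used in place of the printed Z table, so the last digits may differ slightly from table-based values.

```python
from math import sqrt
from statistics import NormalDist

# Summary values stated in Example 1
n, xbar, s = 128, 4.7, 7.8
mu0, alpha = 3.5, 0.05

z = (xbar - mu0) / (s / sqrt(n))      # large-sample z statistic
p_value = 1 - NormalDist().cdf(z)     # Ha: mu > mu0, so use the upper tail

print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```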
Example 2. Prior to the development of a new medicine for a specific parasite, the
average number of that parasite per ml of feces was 4.5. To determine whether the new
medicine has been effective in reducing the average number of parasites per ml of
feces, a random sample of 30 is taken from patients treated with the new medicine and
the number of parasites per ml of feces is recorded. The sample mean and standard
deviation were computed as follows: x̄ = 3.7, s = 1.3
Is there sufficient evidence to conclude (at significance level 0.05) that the average
number of parasites per ml of feces has decreased since the administration of the new
medicine?
Solution
Note that the sample size n = 30 is sufficiently large so that the sampling distribution
of x̄ is approximately normal and that s provides a good approximation to σ. Since
the required assumption is satisfied, we may proceed with a large-sample test of
hypothesis about µ.
1. Data See the previous example
2. A one-sided test
3. Formulate the hypotheses as
H0: µ = 4.5 (i.e., no change in average number of parasites per ml)
Ha: µ < 4.5 (i.e., average number of parasites per ml has decreased)
4. The significance level α = .05
5. The rejection region We are dealing with a one-tailed test in which the symbol "<"
occurs in Ha, so the rejection region consists of an area α = 0.05 in the lower tail of the
sampling distribution of the standardized test statistic (see the next Figure).
Figure Location of rejection region
6-7. The value of the test statistic is computed as follows:
$$z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{3.7 - 4.5}{1.3/\sqrt{30}} = -3.37$$
8. Find the probability that the test statistic is in the tail beyond the calculated value,
i.e. P(z < -3.37) = 0.0004; this is the p-value.
9. Check the associated probability (p-value). This value is < α.
10. Decision Since this area does fall within the rejection region, i.e. p-value < α, we reject
the null hypothesis and accept the alternative hypothesis at the significance level of
α = .05, i.e. the result is statistically significant at the significance level of α = .05.
Conclusion The average number of parasites per ml has significantly decreased since
the administration of the new medicine. It appears that the new medicine was
effective in reducing the average number of parasites per ml of feces.
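The left-tailed calculation in Example 2 can be checked the same way. In this sketch the variable names are illustrative and `statistics.NormalDist` replaces the Z table; for a left-tailed test the p-value is the area below the test statistic.

```python
from math import sqrt
from statistics import NormalDist

# Summary values stated in Example 2
n, xbar, s = 30, 3.7, 1.3
mu0, alpha = 4.5, 0.05

z = (xbar - mu0) / (s / sqrt(n))   # negative, since xbar is below mu0
p_value = NormalDist().cdf(z)      # Ha: mu < mu0, so use the lower tail

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```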
Example 3. In a study on nutrition, a sample of 180 infants, age 13-24 months, from rural villages
had an average vitamin A intake of 90.6 and variance of 25. Test whether the true
average vitamin A intake of 13-24-month-old infants is different from 90. Use a level of
significance of 0.05.
Solution
The sample size (180) exceeds 30, so we proceed with the large-sample test about µ.
Because shifts in µ in either direction are important, the test is two-tailed.
1. Data: See the example
2. A two-tailed test
3. Formulate the hypotheses as
H0: µ = 90
Ha: µ ≠ 90
4. The significance level α = .05
5. The rejection region We are dealing with a two-tailed test, so the rejection region
consists of two areas (each equal to 0.025), one in the upper and one in the lower tail of the
sampling distribution of the standardized test statistic.
Figure Location of rejection region
6-7. The value of the test statistic is computed as follows (s = √25 = 5):
$$z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{90.6 - 90}{5/\sqrt{180}} = 1.61$$
8. Find the probability that the test statistic is in the tail beyond the calculated value.
Since this example is dealing with a two-sided test, the p-value is computed as the sum
of the two tail areas. We go to the Z table and see what probability values are in the
two tails beyond the points Z = -1.61 and Z = 1.61. We find that 0.0537 is in the right
tail beyond Z = 1.61 and the same amount is in the left tail beyond Z = -1.61.
So the p-value of the result is 0.0537 + 0.0537 = 0.1074
9. Check the associated probability (p-value). 0.1074 is > α.
10. Decision H0 is not rejected at the predetermined level. Consequently, the result
is not statistically significant.
Conclusion At α = 0.05, we cannot conclude that the average vitamin A intake of
13-24-month-old infants is different from 90, the recommended daily allowance for
vitamin A.
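For the two-tailed case in Example 3, the p-value is twice the area beyond the absolute value of the test statistic. The sketch below is a hedged check of that arithmetic; the variable names are illustrative, the standard deviation is taken as the square root of the stated variance, and `statistics.NormalDist` replaces the Z table.

```python
from math import sqrt
from statistics import NormalDist

# Summary values stated in Example 3
n, xbar, variance = 180, 90.6, 25
mu0, alpha = 90, 0.05

s = sqrt(variance)                              # s = 5
z = (xbar - mu0) / (s / sqrt(n))
p_value = 2 * (1 - NormalDist().cdf(abs(z)))    # both tails added together

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```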
Hypothesis tests about the difference between two population means, µ1 and µ2
(Comparing two means)
The two-sample test is a statistical procedure used to compare the means of two
samples.
Hypothesis
ONE-TAILED TEST
H0: µ1 = µ2
Ha: µ1 > µ2 or Ha: µ1 < µ2
TWO-TAILED TEST
H0: µ1 = µ2
Ha: µ1 ≠ µ2
Test statistic:
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
Assumptions:
1. The sample sizes n1 and n2 are sufficiently large (n1 ≥ 30 and n2 ≥ 30).
2. The samples are selected randomly and independently from the target populations.
Example
In a study on pregnant women in their third trimester who delivered during Ramadan
or the first two weeks of Shawwal, the birthweight of the baby (in kg) was measured
for independent random samples of babies of fasting and non-fasting women. The
results of the investigation are summarized in the Table below. Does this evidence
indicate that the mean birthweight of babies of non-fasting mothers is significantly different
from the mean birthweight of babies of fasting mothers? Use a significance level of α = .05.
            Non-fasting     Fasting
n           n1 = 75         n2 = 64
mean (kg)   x̄1 = 3.00       x̄2 = 2.95
SD (kg)     s1 = .11        s2 = .09
Solution
1. Data See the previous example
2. A two-sided test
3. Formulate the hypotheses
The researcher wants to test the hypotheses
H0: (µ1 - µ2) = 0 (i.e., no difference between the mean birthweights of babies)
Ha: (µ1 - µ2) ≠ 0 (i.e., there is a difference between the mean weight of babies of
fasting and non-fasting mothers)
where µ1 = the mean birthweight of babies of non-fasting mothers
µ2 = the mean birthweight of babies of fasting mothers
4. The significance level α = 0.05
5. Specify the rejection region
Figure Rejection region
We are dealing with a two-tailed large-sample test in which the symbol "≠" occurs in
Ha, so the rejection regions consist of two areas (each equal to 0.025), one in the upper and
one in the lower tail of the sampling distribution of the standardized test statistic.
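6-7. This test is based on a z statistic. We compute the test statistic as follows, where D0 = 0 is the hypothesized difference between the means:
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{(3.00 - 2.95) - 0}{\sqrt{\dfrac{(.11)^2}{75} + \dfrac{(.09)^2}{64}}} = 2.947$$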
8. Find the probability that the test statistic is in the tail beyond the calculated value,
i.e. P(z > 2.947) = 1 – P(z ≤ 2.947) = 1 – 0.9984 = 0.0016.
Because the test is two-tailed, the p-value for this test is 0.0016 + 0.0016 = 0.0032.
9. Check the associated probability (p-value). This value is < α.
10. Decision Since the p-value falls within the rejection region, we reject H0 at the 0.05
level of α; thus we can say that the result is statistically significant.
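The large-sample two-mean calculation for the birthweight example can be sketched as follows. The helper name `two_sample_z` and its parameter `d0` (the hypothesized difference, 0 here) are illustrative choices, and `statistics.NormalDist` replaces the Z table.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, s1, n1, mean2, s2, n2, d0=0.0):
    """z statistic for the difference between two means (large samples)."""
    standard_error = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return ((mean1 - mean2) - d0) / standard_error


# Non-fasting group first, fasting group second, as in the table
z = two_sample_z(3.00, 0.11, 75, 2.95, 0.09, 64)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-tailed test

print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "do not reject H0")
```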
Figure Rejection regions: α/2 = .025 in each tail of the z distribution (Reject), with 1 – α = 0.95 between them
Student's t-distribution
The t-distribution is very much like the normal z-distribution. In particular, both are
symmetric, bell-shaped, and have a mean of 0. However, the distribution of t depends
on a quantity called its degrees of freedom (df), which is equal to (n - 1) when dealing
with a one-sample t-test, and (n1 + n2 - 2) when dealing with a two-sample t-test. The
values of t that locate an area of α in the upper tail of the t-distribution,
for various values of α and degrees of freedom, are presented in the
next table.
The t-distribution is used instead of the Z-distribution when the sample size is
small (below 30 or so) and the population standard deviation, σ, is not known, so
that you have to estimate it using the sample standard deviation, s.
One-sample t-test (n < 30) of hypothesis about a population mean.
ONE-TAILED TEST
H0: µ = µ0
Ha: µ > µ0 (or Ha: µ < µ0)
TWO-TAILED TEST
H0: µ = µ0
Ha: µ ≠ µ0
Test statistic:
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
where the distribution of t is based on (n – 1) degrees of freedom.
Assumption: The relative frequency distribution of the population from which the
sample was selected is approximately normal.
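The t statistic above has the same form as the z statistic; only the reference distribution changes. As a rough sketch of how a tail area of the t-distribution could be obtained without a printed table, the code below integrates the t density numerically with Simpson's rule. The function names are hypothetical, the integration scheme is an illustrative choice, and the `math.gamma` call limits this sketch to moderate degrees of freedom (a few hundred at most).

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Area to the right of t (Simpson's rule on [0, |t|]; steps is even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3      # area between 0 and |t|
    upper = 0.5 - central        # area beyond |t| in one tail (symmetry)
    return upper if t >= 0 else 1 - upper


def one_sample_t(xbar, mu0, s, n):
    """t statistic and its degrees of freedom (n - 1)."""
    return (xbar - mu0) / (s / sqrt(n)), n - 1
```

For a right-tailed test the p-value is `t_upper_tail(t, df)`, for a left-tailed test it is `1 - t_upper_tail(t, df)`, and for a two-tailed test it is `2 * t_upper_tail(abs(t), df)`.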
Example 4. Boys of a certain age have a mean weight of 85 lb (pound, 1 kg =
2.20462 pound). An observation was made that in a city neighborhood, children were
underfed. As evidence, 25 boys of that age in the neighborhood were weighed and
found to have a mean x̄ of 80.94 lb and a standard deviation s of 11.60 lb. Assume
that weights in the population are approximately normally distributed. Do the sample
data provide sufficient evidence for us to conclude that children in the city neighborhood
were really underfed?
Solution
Note that the sample size n = 25 is < 30, and σ is unknown. Since the population is
approximately normal, we conduct a small-sample t test.
1. Data See the previous example
2. A one-sided test
3. Formulate the hypotheses as
H0: µ = 85
Ha: µ < 85
4. The significance level α = 0.05
5. The rejection region We are dealing with a one-tailed test in which the symbol "<"
occurs in Ha, so the rejection region consists of an area α = 0.05 in the lower tail of the
sampling distribution (see the Figure below).
Figure Location of rejection region
6-7. The value of the test statistic is computed as follows:
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{80.94 - 85}{11.60/\sqrt{25}} = -1.75$$
8-9. The exact p-value for this test cannot be obtained from the Student's t distribution
table. The p-value can be stated as an interval, however. In the present example, ignoring the
sign of the t value, and entering the Student's t distribution table at 24 degrees of
freedom, we find that 1.75 comes between the values for tail probabilities of 0.025 and 0.05, that
is, 0.025 < p-value < 0.05.
10. Decision Since the p-value is < α = 0.05, this area does fall within the rejection
region; we reject the null hypothesis and accept the alternative hypothesis at the
significance level of α = .05.
Conclusion There is enough evidence to support the underfeeding complaint.
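The table lookup used in Example 4 can be mimicked in code. In the sketch below, the critical values for 24 degrees of freedom are copied from a standard Student's t table (an assumption, since the table itself is not reproduced in these notes), and the helper name `bracket_p_value` is hypothetical.

```python
from math import sqrt

# Upper-tail areas and critical values for df = 24, taken from a standard
# Student's t table (assumed values; the table is not part of these notes).
T_TABLE_DF24 = [(0.10, 1.318), (0.05, 1.711), (0.025, 2.064),
                (0.01, 2.492), (0.005, 2.797)]


def bracket_p_value(t, table):
    """Return (lower, upper) bounds on the one-tailed p-value of |t|."""
    t = abs(t)                      # ignore the sign, as in the example
    lower, upper = 0.0, 0.5
    for area, critical in table:    # areas shrink as critical values grow
        if t >= critical:
            upper = area
        else:
            lower = area
            break
    return lower, upper


n, xbar, s, mu0 = 25, 80.94, 11.60, 85
t = (xbar - mu0) / (s / sqrt(n))
print(f"t = {t:.2f}, p-value between {bracket_p_value(t, T_TABLE_DF24)}")
```

Because the whole interval lies below α = 0.05, the bracket alone is enough to reach the decision.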
Example 5. Estimations of plasma calcium concentration in 18 patients with
Everley's syndrome* gave a mean of 3.2 mmol/l, with standard deviation 1.1 mmol/l.
Previous experience from a number of investigations and published reports had shown
that the mean was commonly close to 2.5 mmol/l in healthy people aged 20-44, the
age range of the patients. Is the mean in these patients abnormally high? Assume that
plasma calcium concentration is approximately normally distributed.
Solution
In order to determine whether the mean in these patients is abnormally high, we will
conduct a small-sample t test.
1. Data See the previous example
2. A one-sided test
3. Formulate the hypotheses as
H0: µ = 2.5
Ha: µ > 2.5
4. The significance level α = 0.05
5. The rejection region We are dealing with a one-tailed test in which the symbol ">"
occurs in Ha, so the rejection region consists of an area of 0.05 in the upper tail of the
sampling distribution (see the Figure below).
Figure Location of rejection region
6-7. The value of the test statistic is
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{3.2 - 2.5}{1.1/\sqrt{18}} = 2.70$$
8-9 Entering the Student's t distribution table at 17 degrees of freedom, we find that
2.70 comes between the values for tail probabilities of 0.005 and 0.01, i.e. 0.005 < p-value < 0.01.
10. Decision Since the p-value is < 0.05, this area does fall within the rejection
region; we reject the null hypothesis and accept the alternative hypothesis at the
significance level of α = .05.
Conclusion We may conclude that the sample mean is, at least statistically, unusually
high. Whether it should be regarded clinically as abnormally high is something that
needs to be considered separately by the physician in charge of that case.
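Where software is available, the interval from the table can be replaced by a numerical tail area. The sketch below recomputes the statistic for Example 5 from the stated summary values (it evaluates to about 2.70) and approximates the upper-tail area by Simpson's rule. The helper `t_upper_tail` is a hypothetical name and the integration method is an illustrative choice, not the table-based method used in these notes.

```python
from math import gamma, pi, sqrt


def t_upper_tail(t, df, steps=2000):
    """Area to the right of t >= 0 under Student's t (Simpson's rule)."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    h = t / steps
    area = pdf(0) + pdf(t)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 - area * h / 3       # half the curve minus area on [0, t]


# Summary values stated in Example 5
n, xbar, s, mu0, alpha = 18, 3.2, 1.1, 2.5, 0.05

t = (xbar - mu0) / (s / sqrt(n))
p_value = t_upper_tail(t, n - 1)    # Ha: mu > mu0, so use the upper tail

print(f"t = {t:.2f}, df = {n - 1}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```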
Example 6.
A manufacturer of Glucosamine capsules claims that each capsule contains on the
average 1500 mg of glucosamine. To test this claim a sample of 10 was selected and
amount of glucosamine was measured in each capsule. The sample showed a mean of
1410 mg and standard deviation of 90 mg. Test the hypothesis that the mean amount
of glucosamine has not changed, using a level of significance of α = .05. Assume
that the amount of glucosamine has a distribution that is approximately normal.
Solution
Note that the sample size n = 10 is < 30, and σ is unknown. Since the population is
approximately normal, we conduct a small-sample t test.
1. Data See the previous example
2. A two-tailed test, since the question asks us to test that the mean has not changed.
3. Formulate the hypotheses as
H0: µ = 1500
Ha: µ ≠ 1500
4. The significance level α = 0.05
5. The rejection region We are dealing with a two-tailed test, so the rejection region
consists of two areas (each equal to 0.025) in the upper and lower tails of the sampling
distribution.
6. The value of the test statistic is
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{1410 - 1500}{90/\sqrt{10}} = -3.162$$
7. Ignoring the sign and entering the Student's t distribution table at 9 degrees of
freedom, we find that 3.162 comes between the values for tail probabilities of 0.005 and 0.01.
Since we are dealing with a two-tailed test, the p-value is the total of the area to the left
of the test statistic added to the area to the right of the test statistic. Accordingly,
0.005×2 < p-value < 0.01×2, in other words 0.01 < p-value < 0.02.
8. Decision Since the p-value is < α = 0.05, this area does fall within the rejection
region; we reject H0 and accept Ha at the significance level of .05.
9. Conclusion It is therefore unlikely that the sample with mean 1410 came from the
population with mean 1500. We conclude that there is some evidence to suggest that
the mean amount of glucosamine has changed.
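The same decision can be reached by comparing the statistic with critical values directly. Computed from the stated summary values (mean 1410, s = 90, n = 10), the statistic evaluates to about -3.16. The critical values below are standard Student's t table entries for 9 degrees of freedom, which is an assumption because the table is not reproduced in these notes; the variable names are illustrative.

```python
from math import sqrt

# Standard t-table entries for df = 9 (assumed, not taken from these notes):
# upper-tail area 0.025 -> 2.262, 0.01 -> 2.821, 0.005 -> 3.250
T_CRIT_025, T_CRIT_01, T_CRIT_005 = 2.262, 2.821, 3.250

# Summary values stated in Example 6
n, xbar, s, mu0 = 10, 1410.0, 90.0, 1500.0

t = (xbar - mu0) / (s / sqrt(n))

# Two-tailed test at alpha = 0.05: reject H0 when |t| exceeds the
# critical value that leaves 0.025 in each tail.
in_rejection_region = abs(t) > T_CRIT_025

# |t| between the 0.01 and 0.005 critical values means the two-tailed
# p-value lies between 2 * 0.005 and 2 * 0.01.
p_between_01_and_02 = T_CRIT_01 < abs(t) < T_CRIT_005

print(f"t = {t:.3f}, reject H0: {in_rejection_region}")
print(f"0.01 < p-value < 0.02: {p_between_01_and_02}")
```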
Figure Rejection regions: α/2 = .025 in each tail of the t distribution (Reject), with 1 – α = 0.95 between them
Two sample t-test
The two sample t-test is used to compare the means in two independent groups of
observations using representative samples.
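The test statistic for the two-sample t-test is:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$
where $s_p^2$ is the pooled variance:
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
and the distribution of t is based on (n1 + n2 - 2) degrees of freedom.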
Assumptions:
The validity of the two-sample t-test depends on various assumptions being satisfied:
1. Each subject must be randomly selected from the population.
2. The random samples are selected in an independent manner from the two
populations.
3. The populations from which the samples are selected both have approximately
normal distributions.
4. The two samples come from populations with equal (or approximately equal)
variances (or standard deviations).
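The pooled-variance formula and the two-sample t statistic translate directly into code. This is a minimal sketch under the equal-variance assumption listed above; the function names `pooled_variance` and `two_sample_t` are hypothetical, and the inputs are sample variances (not standard deviations).

```python
from math import sqrt


def pooled_variance(n1, var1, n2, var2):
    """Weighted average of the two sample variances (weights n - 1)."""
    return ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)


def two_sample_t(mean1, var1, n1, mean2, var2, n2):
    """Pooled two-sample t statistic and its degrees of freedom."""
    sp2 = pooled_variance(n1, var1, n2, var2)
    t = (mean1 - mean2) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


# Made-up numbers for illustration: equal variances of 4, eight subjects each
t, df = two_sample_t(12.0, 4.0, 8, 10.0, 4.0, 8)
print(f"t = {t:.2f} on {df} degrees of freedom")
```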
Example The addition of bran to the diet has been reported to benefit patients with
diverticulosis (disease of the colon). Several different bran preparations are available,
and a clinician wants to test the efficacy of two of them on patients, since favorable
claims have been made for each. Among the consequences of administering bran that
require testing is the transit time through the alimentary canal.
The clinician selects two groups of patients aged 40-64 with diverticulosis of
comparable severity. Sample 1 contains 12 patients who are given treatment A, and
sample 2 contains 15 patients who are given treatment B. The transit times of food
through the gut are measured by a standard technique with marked pellets and the
results are recorded, in order of increasing time, in the next table.
Table Transit times of marker pellets through the alimentary canal of patients with diverticulosis
on two types of treatment: unpaired comparison
Transit times (h)
        Sample 1 (Treatment A)   Sample 2 (Treatment B)
        52                       44
        64                       51
        68                       52
        74                       55
        79                       60
        83                       62
        84                       66
        88                       68
        95                       69
        97                       71
        101                      71
        116                      76
                                 82
                                 91
                                 108
Total   1001                     1026
Mean    83.42                    68.40
Does the transit time through the alimentary canal differ in the two groups of
patients taking these two preparations?
Consider the following:
The samples were independently and randomly selected. The populations of transit
times of food at both treatments have approximately normal distributions and their
variances are equal.
Solution
Summary of test statistics
            Treatment A       Treatment B
n           n1 = 12           n2 = 15
mean        x̄1 = 83.4167      x̄2 = 68.4
SD          s1 = 17.63        s2 = 16.47
variance    s1² = 310.99      s2² = 271.4
1. Data See the previous example
2. A two-sided test
3. Formulate the hypotheses
The researcher wants to test the following hypothesis:
H0: (µ1 - µ2) = 0 (i.e., no difference between mean transit times of food between
treatment A and B)
Ha: (µ1 - µ2) ≠ 0 (i.e., mean transit times of food between treatment A and B are
different)
where µ1 and µ2 are the true mean transit times of food of treatment A and B,
respectively.
4. The significance level α = 0.05
5. The rejection region Since it is possible for the difference in mean transit times
for A-B to be positive or negative, we will employ a two sided test. The rejection
region consists of two areas (each equal to 0.025) in the upper and lower tail of
the sampling distribution.
6-7. The test statistic is
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$
where D0 = 0 is the hypothesized difference between the means and $s_p^2$ is the pooled variance, equal to
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(12 - 1)(310.99) + (15 - 1)(271.4)}{12 + 15 - 2} = 288.82$$
Using this pooled sample variance in the computation of the test statistic, we obtain
$$t = \frac{(83.4167 - 68.4) - 0}{\sqrt{288.82\left(\dfrac{1}{12} + \dfrac{1}{15}\right)}} = 2.281$$
8. The exact p-value for this test cannot be obtained from the Student's t distribution
table. The p-value can be stated as an interval, however. In the present example, entering the
Student's t distribution table at 25 degrees of freedom {that is, (n1 + n2 - 2) = (12 + 15 – 2) = 25},
we find that 2.281 lies between 2.167 and 2.485. In other words, it is found
between the tail probability values of 0.01 and 0.02. Since we are dealing with a two-tailed
test, the p-value is the total of the area to the left of the test statistic added to the area
to the right of the test statistic. Accordingly, 0.01×2 < p-value < 0.02×2, i.e. 0.02 <
p-value < 0.04. This degree of probability is smaller than the conventional level of 0.05.
9. Decision Since the p-value is < α = 0.05, we reject H0. The null hypothesis that there is
no difference between the means is therefore somewhat unlikely.
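The whole bran example can be recomputed from the raw transit times in the table. The sketch below uses the standard library `statistics` module for the means and sample variances; the variable names are illustrative, and the two critical values are the table entries quoted in step 8 for 25 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

# Transit times (h) from the table
treatment_a = [52, 64, 68, 74, 79, 83, 84, 88, 95, 97, 101, 116]
treatment_b = [44, 51, 52, 55, 60, 62, 66, 68, 69, 71, 71, 76, 82, 91, 108]

n1, n2 = len(treatment_a), len(treatment_b)
mean_a, mean_b = mean(treatment_a), mean(treatment_b)
var_a, var_b = variance(treatment_a), variance(treatment_b)  # n - 1 divisor

# Pooled variance and two-sample t statistic (equal variances assumed)
sp2 = ((n1 - 1) * var_a + (n2 - 1) * var_b) / (n1 + n2 - 2)
t = (mean_a - mean_b) / sqrt(sp2 * (1 / n1 + 1 / n2))
df = n1 + n2 - 2

print(f"means: {mean_a:.4f}, {mean_b:.4f}")
print(f"variances: {var_a:.2f}, {var_b:.2f}, pooled: {sp2:.2f}")
print(f"t = {t:.3f} on {df} degrees of freedom")

# Table values quoted in step 8: 2.167 (tail area 0.02), 2.485 (0.01)
print("0.02 < two-tailed p-value < 0.04:", 2.167 < t < 2.485)
```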
Figure Rejection regions: α/2 = .025 in each tail of the t distribution (Reject), with 1 – α = 0.95 between them