Hypothesis Testing - Islamic University of Gazasite.iugaza.edu.ps/.../2018/02/11_Hypothesis-testing.pdf · 2018. 2. 10. · Hypothesis Testing Hypothesis testing for a population

BIOSTATISTICS NURS 3324


Hypothesis Testing

Hypothesis testing for a population mean (one sample test)

A hypothesis is a statement about one (or more) population parameters.

There are two hypotheses:

1- Null hypothesis:

- The null hypothesis is the statement that there is "no effect" or "no difference",

that is why the word "null" is used. It is denoted by H0 and always contains the

equality "=" sign. It should be stressed that researchers frequently put forward a

null hypothesis in the hope that they can discredit it. The null hypothesis is always of the form

H0: Population parameter = specified number

2- Alternative hypothesis:

- The alternative hypothesis in turn is the "opposite" of the null hypothesis, that is,

there is an effect or difference. It is the hypothesis that we try to establish and is

denoted by Ha. It never contains the equality "=" sign.

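The alternative hypothesis takes one of the following forms:

Ha: Population parameter ≠ specified number (two-tailed)

Ha: Population parameter < specified number (left-tailed)

Ha: Population parameter > specified number (right-tailed)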

Either null hypothesis (H0) or the alternative

hypothesis (Ha) is true, but not both i.e. they

cannot simultaneously be true.

Steps of Hypothesis Testing (p-value approach)

The p-value is defined informally as the probability of obtaining results at least as extreme as the study results by chance alone if the null hypothesis is true. When you perform a hypothesis test in statistics, a p-value helps you determine the significance of your results.

1. Collect the data, i.e., obtain a random sample from the population(s) of interest.

2. Decide whether a one- or a two-tailed (sided) test is appropriate; this decision

depends on the research question.

Test type:    Two-tailed     Left-tailed (one-tailed)    Right-tailed (one-tailed)

H0:           µ = µ0         µ = µ0                      µ = µ0

Ha:           µ ≠ µ0         µ < µ0                      µ > µ0

Note: µ0 is the assumed value of the population mean.

3. State a Null (H0) and Alternative hypothesis (Ha).

4. Choose a level of significance (α = 0.001, 0.01, or 0.05).

5. Specify the rejection region/s.

The location of the rejection region depends on whether the test is one-tailed or

two-tailed.

a. For a one-tailed test in which the symbol ">" occurs in Ha, the rejection region consists of an area (= α) in the upper tail of the sampling distribution.

[Figure: rejection region of area α in the upper tail, for Ha: µ > µ0]

b. For a one-tailed test in which the symbol "<" appears in Ha, the rejection region consists of an area (= α) in the lower tail of the sampling distribution.

c. For a two-tailed test, in which the symbol "≠" occurs in Ha, the rejection region consists of two areas (α/2 in the upper tail + α/2 in the lower tail = α).

6. Choose an appropriate test statistic that is reasonable in the context of the given

hypothesis test.

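If the population has a normal distribution with known variance:

$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$

If n is large (n ≥ 30), the population is normal or not normal, and the variance is unknown:

$$z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

If n is small (n < 30), the population is approximately normal, and the variance is unknown:

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$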

7. Calculate the test statistic.

8. Use the table (Standard normal distribution or student’s t-table) to find the

probability or the area/s in the tail beyond (above or below, depending on the

alternative hypothesis) the value of the test statistic, that is the p-value of the test.

[Figure: rejection regions of area α/2 in each tail for Ha: µ ≠ µ0, and of area α in the lower tail for Ha: µ < µ0]

For two tailed test, the p-value is the total of the areas to the left and the right of the

test statistic.

9. Check the associated probability (p-value)

10. Making decision

√ If the p-value falls within the rejection region, i.e. p-value < α, H0 is rejected at the predetermined level, and we say that the result is statistically significant.

√ If the p-value does not fall within the rejection region, i.e. p-value ≥ α, H0 is not rejected at the predetermined level, and we say that the result is not statistically significant.

Researchers and statisticians generally agree on the following conventions for

interpreting p-values

p-value Result is:

p > 0.05 not significant

p ≤ 0.05 significant

p ≤ 0.01 highly significant

p ≤ 0.001 very highly significant
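The steps above can be sketched in a few lines of Python for the large-sample z case. This is only an illustration and not part of the original notes: the function name `one_sample_z_test`, its parameters and the numbers in the usage line are hypothetical, and `statistics.NormalDist` from the standard library stands in for the printed standard normal table.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(xbar, mu0, sd, n, tail="two", alpha=0.05):
    """Return (z, p_value, reject_h0) for a one-sample z test.

    tail: "two" (Ha: mu != mu0), "left" (Ha: mu < mu0), "right" (Ha: mu > mu0).
    """
    z = (xbar - mu0) / (sd / sqrt(n))   # step 7: test statistic
    cdf = NormalDist().cdf              # standard normal CDF (replaces the z table)
    if tail == "right":
        p = 1 - cdf(z)                  # area above z
    elif tail == "left":
        p = cdf(z)                      # area below z
    elif tail == "two":
        p = 2 * (1 - cdf(abs(z)))       # total area in both tails beyond |z|
    else:
        raise ValueError("tail must be 'two', 'left' or 'right'")
    return z, p, p < alpha              # step 10: reject H0 when p < alpha


# Hypothetical numbers, chosen only to exercise the function.
z, p, reject = one_sample_z_test(xbar=52.0, mu0=50.0, sd=8.0, n=64, tail="right")
print(round(z, 3), round(p, 4), reject)
```

The `tail` argument mirrors step 2 (one- or two-tailed), and the comparison `p < alpha` mirrors the decision rule in step 10.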

One-sample test of hypothesis about a population mean

The one-sample test is a statistical procedure used to compare the mean of your sample data to a known value.

Example 1 In 128 patients under 12 years of age with a particular congenital heart defect, the mean intensive care unit stay after surgery was 4.7 days with a standard deviation of 7.8. Can we conclude that the average intensive care unit (ICU) stay of patients under 12 with this defect is more than 3.5 days? Use α = 0.05.

Solution

Note that the sample size n = 128 is sufficiently large so that the sampling distribution of x̄ is approximately normal and that s provides a good approximation to σ. Since the required assumption is satisfied, we may proceed with a large-sample test of hypothesis about µ.

1. Data: see the previous example

2. A one-sided test

3. Formulate the hypotheses as

H0: µ = 3.5

Ha: µ > 3.5

[Figure: for a two-tailed test, the p-value is the total area in the two tails beyond –(test statistic) and +(test statistic)]

4. The significance level α = 0.05

5. Specify the rejection region We are dealing with one-tailed test in which the

symbol ">" occurs in Ha, the rejection region is an area of 0.05 in the upper tail of the

sampling distribution of the standardized test statistic.

Figure Rejection region for Example

6-7. Choose and compute the value of the test statistic:

$$z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{4.7 - 3.5}{7.8 / \sqrt{128}} = 1.741$$

8. Find the probability that the test statistic is in the tail beyond the calculated value

i.e. P (z > 1.741) = 1 – P (z ≤ 1.741), the p-value for this test is 1 – 0.9591 = 0.0409.

9. Check the associated probability (p-value). This value is < α = 0.05.

10. Decision Since the p-value falls within the rejection region, i.e. p-value < α, we reject H0 at the 0.05 level of significance; the result is statistically significant.

Conclusion There is sufficient evidence (at α = 0.05) to conclude that the mean intensive care unit stay after surgery is greater than 3.5 days.
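As a cross-check of Example 1, the same arithmetic can be run in Python. This is a minimal sketch using only the summary values quoted in the example; the variable names are illustrative, and `statistics.NormalDist` is used in place of the z table.

```python
from math import sqrt
from statistics import NormalDist

n, xbar, s = 128, 4.7, 7.8   # sample size, mean and SD from Example 1
mu0, alpha = 3.5, 0.05       # hypothesised mean and significance level

z = (xbar - mu0) / (s / sqrt(n))    # large-sample z statistic
p_value = 1 - NormalDist().cdf(z)   # right-tailed test: area above z

print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {p_value < alpha}")
```

The result should agree with the hand calculation to table precision: z close to 1.741 and a p-value close to 0.0409.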

Example 2 Before a new medicine for a specific parasite was developed, the average number of that parasite per ml of feces was 4.5. To determine whether the new medicine has been effective in reducing the average number of parasites per ml of feces, a random sample of 30 is taken from patients treated with the new medicine and the number of parasites per ml of feces is recorded. The sample mean and standard deviation were computed as x̄ = 3.7 and s = 1.3.

Is there sufficient evidence to conclude (at significance level 0.05) that the average number of parasites per ml of feces has decreased since the administration of the new medicine?

Solution

Note that the sample size n = 30 is sufficiently large so that the sampling distribution of x̄ is approximately normal and that s provides a good approximation to σ. Since the required assumption is satisfied, we may proceed with a large-sample test of hypothesis about µ.

1. Data See the previous example

2. A one-sided test


3. Formulate the hypotheses

H0: µ = 4.5 (i.e., no change in average number of parasites per ml)

Ha: µ < 4.5 (i.e., average number of parasites per ml has decreased)

4. The significance level α = 0.05



5. The rejection region We are dealing with a one-tailed test in which the symbol "<" occurs in Ha, so the rejection region consists of an area α = 0.05 in the lower tail of the sampling distribution of the standardized test statistic (see the next Figure).

Figure Location of rejection region

6-7. The value of the test statistic is computed as follows:

$$z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{3.7 - 4.5}{1.3 / \sqrt{30}} = -3.37$$

8. Find the probability that the test statistic is in the tail beyond the calculated value.

i.e. P (z < -3.37) = 0.0004; this is the p-value.


10. Decision Since the p-value falls within the rejection region, i.e. p-value < α, we reject the null hypothesis and accept the alternative hypothesis at the significance level α = 0.05, i.e. the result is statistically significant at α = 0.05.

Conclusion The average number of parasites per ml has significantly decreased since the administration of the new medicine. It appears that the new medicine was effective in reducing the average number of parasites per ml of feces.
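Example 2 can be reproduced the same way for a left-tailed test. The sketch below takes x̄ = 3.7 and s = 1.3 as the sample values, which is the reading consistent with the reported z = -3.37; the variable names are illustrative and `statistics.NormalDist` replaces the z table.

```python
from math import sqrt
from statistics import NormalDist

n, xbar, s = 30, 3.7, 1.3    # sample size, mean and SD (as read from Example 2)
mu0, alpha = 4.5, 0.05       # mean before the new medicine, significance level

z = (xbar - mu0) / (s / sqrt(n))   # large-sample z statistic
p_value = NormalDist().cdf(z)      # left-tailed test: area below z

print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {p_value < alpha}")
```

For a left-tailed alternative the p-value is the area below the statistic, so no subtraction from 1 is needed.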

Example 3 In a study on nutrition, a sample of 180 infants, age 13-24 months, from rural villages

had an average vitamin A intake of 90.6 and variance of 25. Test whether the true

average vitamin A intake of 13-24 months-old is different from 90. Use a level of

significance of 0.05.

Solution

The sample size (180) exceeds 30, so we proceed with the large-sample test about µ. Because shifts in µ in either direction are important, the test is two-tailed.

1. Data: See the example

2. A two-tailed test


3. Formulate the hypotheses

H0: µ = 90

Ha: µ ≠ 90

4. The significance level α = 0.05

5. The rejection region We are dealing with two-tailed test, the rejection region

consists of two areas (each equal to 0.025) in the upper or lower tail of the

sampling distribution of the standardized test statistic.


6-7. The value of the test statistic is

$$z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{90.6 - 90}{5 / \sqrt{180}} = 1.61$$


Since this example is dealing with two-sided test, the p-value is computed as the sum

of the two tail areas. We go to the Z table and see what probability values are in the

two tails beyond the points Z = -1.61 and Z = 1.61. We find that 0.0537 is in the right

tail beyond Z = 1.61 and the same amount is in the left tail beyond Z = -1.61.

So the p-value of the result is 0.0537 + 0.0537 = 0.1074

9. Check the associated probability (p-value). 0.1074 is > α = 0.05.

10. Decision H0 is not rejected at the predetermined level. Consequently, the result

is statistically not significant.

Conclusion At α = 0.05, we can not conclude that the average vitamin A intake of 13-

24-month-old infants is different from 90, the recommended daily allowance for

vitamin A.
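For the two-tailed case in Example 3, the p-value doubles the area beyond |z|. A minimal Python sketch follows, with illustrative variable names and `statistics.NormalDist` in place of the z table; note that the example reports a variance of 25, so s = 5.

```python
from math import sqrt
from statistics import NormalDist

n, xbar, variance = 180, 90.6, 25.0   # summary values from Example 3
mu0, alpha = 90.0, 0.05
s = sqrt(variance)                    # standard deviation = 5

z = (xbar - mu0) / (s / sqrt(n))
one_tail = 1 - NormalDist().cdf(abs(z))   # area beyond |z| in one tail
p_value = 2 * one_tail                    # two-tailed: both tails together

print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {p_value < alpha}")
```

Here the p-value exceeds α, so H0 is not rejected, matching the decision above.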

Hypothesis tests about the difference between two population means, µ1 and µ2 (Comparing two means)

The two-sample test is a statistical procedure used to compare the means of two samples.

Hypotheses

ONE-TAILED TEST

H0: µ1 = µ2

Ha: µ1 > µ2 (or Ha: µ1 < µ2)

TWO-TAILED TEST

H0: µ1 = µ2

Ha: µ1 ≠ µ2

Test statistic:

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

Assumptions:

1. The sample sizes n1 and n2 are sufficiently large (n1 ≥ 30 and n2 ≥ 30).

2. The samples are selected randomly and independently from the target populations.

Example

In a study on pregnant women in their third trimester who delivered during Ramadan or the first two weeks of Shawwal, the birthweight of the baby (in kg) was measured for independent random samples of babies of fasting and non-fasting women. The results of the investigation are summarized in the Table below. Does this evidence indicate that the mean birthweight of babies of non-fasting mothers is significantly different from the mean birthweight of babies of fasting mothers? Use a significance level of α = 0.05.

                          Non-fasting     Fasting

Sample size               n1 = 75         n2 = 64

Mean birthweight (kg)     x̄1 = 3.00       x̄2 = 2.95

Standard deviation (kg)   s1 = 0.11       s2 = 0.09

Solution

1. Data See the previous example

2. A two-sided test

3. Formulate the hypotheses

The researcher wants to test the hypotheses

H0: (µ1 - µ2) = 0 (i.e., no difference between the mean birthweights of babies)

Ha: (µ1 - µ2) ≠ 0 (i.e., there is a difference between the mean birthweights of babies of fasting and non-fasting mothers)

where µ1 = the mean birthweight of babies of non-fasting mothers

µ2 = the mean birthweight of babies of fasting mothers


5. Specify the rejection region

Figure Rejection region

We are dealing with two-tailed large-sample test in which the symbol "≠" occurs in

Ha, the rejection regions consist of two areas (each equal to 0.025) in the upper or

lower tail of the sampling distribution of the standardized test statistic.

6-7. This test is based on a z statistic. We compute the test statistic as follows:

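$$z = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{(3.00 - 2.95) - 0}{\sqrt{\dfrac{(0.11)^2}{75} + \dfrac{(0.09)^2}{64}}} = 2.947$$

where D0 is the difference (µ1 - µ2) stated in H0, here 0.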


8. Find the probability in the tail beyond the calculated value, i.e. P (z > 2.947) = 1 – P (z ≤ 2.947) = 1 – 0.9984 = 0.0016. Because the test is two-tailed, the p-value for this test is 0.0016 + 0.0016 = 0.0032.


10. Decision Since the p-value falls within the rejection region (0.0032 < α = 0.05), we reject H0 at the 0.05 level of significance; the result is statistically significant.
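The large-sample comparison of two means can be checked with a short Python sketch. It uses the summary values from the birthweight table; the variable names are illustrative, and `statistics.NormalDist` stands in for the z table.

```python
from math import sqrt
from statistics import NormalDist

n1, xbar1, s1 = 75, 3.00, 0.11   # non-fasting mothers
n2, xbar2, s2 = 64, 2.95, 0.09   # fasting mothers
d0, alpha = 0.0, 0.05            # difference under H0, significance level

se = sqrt(s1**2 / n1 + s2**2 / n2)            # standard error of the difference
z = ((xbar1 - xbar2) - d0) / se               # two-sample z statistic
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-tailed p-value

print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {p_value < alpha}")
```

The standard error is computed once and reused, which keeps the statistic in the same form as the formula in steps 6-7.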

[Figure: rejection regions of area α/2 = 0.025 in each tail of the z distribution; central area 1 – α = 0.95]

Student's t-distribution

The t-distribution is very much like the normal z-distribution. In particular, both are symmetric, bell-shaped, and have a mean of 0. However, the distribution of t depends on a quantity called its degrees of freedom (df), which is equal to (n - 1) for a one-sample t-test and (n1 + n2 - 2) for a two-sample t-test. The values of t that locate an area of α in the upper tail of the t-distribution, for various values of α and degrees of freedom, are given in the next table.

There are two cases where you must use the t-distribution instead of the Z-distribution. The first case is where the sample size is small (below 30 or so), and the second case is when the population standard deviation, σ, is not known and you have to estimate it using the sample standard deviation, s.

One-sample t-test (n < 30) of hypothesis about a population mean

ONE-TAILED TEST

H0: µ = µ0

Ha: µ > µ0 (or Ha: µ < µ0)

TWO-TAILED TEST

H0: µ = µ0

Ha: µ ≠ µ0

Test statistic:

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

where the distribution of t is based on (n – 1) degrees of freedom.

Assumption: The relative frequency distribution of the population from which the

sample was selected is approximately normal.



Example 4. Boys of a certain age have a mean weight of 85 lb (pound; 1 kg = 2.20462 lb). An observation was made that children in a city neighborhood were underfed. As evidence, 25 boys of that age in the neighborhood were weighed and found to have a mean x̄ of 80.94 lb and a standard deviation s of 11.60 lb. Assume that weights in the population are approximately normally distributed. Do the sample data provide sufficient evidence for us to conclude that children in that city neighborhood were really underfed?

Solution

Note that the sample size n = 25 is < 30 and σ is unknown. Since the population is approximately normal, we conduct a small-sample t test.


2. A one-sided test


3. Formulate the hypotheses

H0: µ = 85

Ha: µ < 85


5. The rejection region We are dealing with a one-tailed test in which the symbol "<" occurs in Ha, so the rejection region consists of an area α = 0.05 in the lower tail of the sampling distribution (see the Figure below).

6-7. The value of the test statistic is

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{80.94 - 85}{11.60 / \sqrt{25}} = -1.75$$

8-9. The exact p-value for this test cannot be obtained from Student's t distribution

table. The p-value can be stated as interval, however. In present example, ignoring the

sign of the t value, and entering the student's t distribution table at 24 degrees of

freedom, we find that 1.75 comes between probability values of 0.025 and 0.05, that

is, 0.025 < p-value < 0.05.

[Figure: rejection region (Reject H0) in the lower tail of the t distribution]

10. Decision Since the p-value is < α = 0.05, it falls within the rejection region; we reject the null hypothesis and accept the alternative hypothesis at the significance level α = 0.05.

Conclusion there is enough evidence to support the underfeeding complaint.
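Although a printed t table only brackets the p-value, software can approximate it directly. The sketch below is one possible approach and not the method used in these notes: it integrates the Student's t density numerically with Simpson's rule, using only the standard library. The helper names `t_pdf` and `t_upper_tail` are hypothetical.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) by Simpson's rule on [0, |t|] and symmetry about 0."""
    a = abs(t)
    h = a / steps                          # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)  # end points have weight 1
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    tail = 0.5 - total * h / 3             # area beyond |t| in one tail
    return tail if t >= 0 else 1 - tail


# Example 4: n = 25 boys, mean 80.94 lb, SD 11.60 lb, H0: mu = 85, Ha: mu < 85
n, xbar, s, mu0 = 25, 80.94, 11.60, 85.0
t_stat = (xbar - mu0) / (s / sqrt(n))
df = n - 1
p_value = 1 - t_upper_tail(t_stat, df)     # left-tailed: P(T < t_stat)

print(f"t = {t_stat:.2f}, df = {df}, p-value = {p_value:.4f}")
```

The computed p-value should fall inside the interval read from the table, 0.025 < p-value < 0.05.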

Example 5. Estimations of plasma calcium concentration in the 18 patients with

Everley's syndrome* gave a mean of 3.2 mmol/l, with standard deviation 1.1.

Previous experience from a number of investigations and published reports had shown

that the mean was commonly close to 2.5 mmol/l in healthy people aged 20-44, the

age range of the patients. Is the mean in these patients abnormally high? Assume that

plasma calcium concentration is approximately normally distributed.

Solution

In order to determine whether the mean in these patients is abnormally high, we will conduct a small-sample t test.


2. A one-sided test


3. Formulate the hypotheses

H0: µ = 2.5

Ha: µ > 2.5


5. The rejection region We are dealing with one-tailed test in which the symbol ">"

occurs in Ha, the rejection region consists of an area of 0.05 in the upper tail of the

sampling distribution. (see the Figure below)


6-7. The value of the test statistic is

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{3.2 - 2.5}{1.1 / \sqrt{18}} = 2.70$$

8-9. Entering the Student's t distribution table at 17 degrees of freedom, we find that 2.70 comes between probability values of 0.005 and 0.01, i.e. 0.005 < p-value < 0.01.

10. Decision Since the p-value is < 0.05, it falls within the rejection region; we reject the null hypothesis and accept the alternative hypothesis at the significance level α = 0.05.

Conclusion we may conclude that the sample mean is, at least statistically, unusually

high. Whether it should be regarded clinically as abnormally high is something that

needs to be considered separately by the physician in charge of that case.

[Figure: rejection region (Reject H0) in the upper tail of the t distribution]

Example 6.

A manufacturer of glucosamine capsules claims that each capsule contains on average 1500 mg of glucosamine. To test this claim, a sample of 10 capsules was selected and the amount of glucosamine was measured in each capsule. The sample showed a mean of 1410 mg and a standard deviation of 90 mg. Test the hypothesis that the mean amount of glucosamine has not changed, using a level of significance of α = 0.05. Assume that the amount of glucosamine has a distribution that is approximately normal.

Solution

Note that the sample size n = 10 is < 30 and σ is unknown. Since the population is approximately normal, we conduct a small-sample t test.


2. A two-tailed test, since the question asks us to test whether the mean has changed.

3. Formulate the hypotheses

H0: µ = 1500

Ha: µ ≠ 1500


5. The rejection region We are dealing with two-tailed test, the rejection region

consists of two areas (each equal to 0.025) in the upper and lower tail of the sampling

distribution.

6. The value of the test statistic is

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{1410 - 1500}{90 / \sqrt{10}} = -3.162$$

7. Ignoring the sign and entering the Student's t distribution table at 9 degrees of freedom, we find that 3.162 comes between probability values of 0.005 and 0.01. Since we are dealing with a two-tailed test, the p-value is the total area in both tails, below -3.162 and above 3.162. Accordingly, 0.005×2 < p-value < 0.01×2, in other words 0.01 < p-value < 0.02.

8. Decision Since the p-value is < α = 0.05, it falls within the rejection region; we reject H0 and accept Ha at the significance level of 0.05.

9. Conclusion It is therefore unlikely that the sample with mean 1410 came from the

population with mean 1500. We conclude that there is some evidence to suggest that

the mean amount of glucosamine has changed.
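The table lookup used in this example, which brackets the p-value between two tabulated tail areas, can be sketched in Python. The critical values for 9 degrees of freedom below are taken from a standard Student's t table and are an assumption here, since the table itself is not reproduced in these notes; the names `T_TABLE_DF9` and `bracket_one_tail_p` are hypothetical. Recomputing from the stated summary values (mean 1410, SD 90, n = 10) gives |t| of about 3.16, which lies in the interval 0.005 to 0.01 used in the text.

```python
from math import sqrt

# Upper-tail critical values for df = 9 from a standard t table (assumed values):
# tail area -> critical t.
T_TABLE_DF9 = {0.10: 1.383, 0.05: 1.833, 0.025: 2.262, 0.01: 2.821, 0.005: 3.250}


def bracket_one_tail_p(t_abs, table):
    """Return (lower, upper) bounds on the one-tail p-value for |t| = t_abs."""
    lower, upper = 0.0, 0.5                  # a one-tail area beyond |t| is at most 0.5
    for area, crit in sorted(table.items(), reverse=True):  # largest area first
        if t_abs >= crit:
            upper = area                     # beyond this critical value: p <= area
        else:
            lower = area                     # not beyond it: p > area
            break
    return lower, upper


n, xbar, s, mu0 = 10, 1410.0, 90.0, 1500.0   # summary values from Example 6
t_stat = (xbar - mu0) / (s / sqrt(n))

lo, hi = bracket_one_tail_p(abs(t_stat), T_TABLE_DF9)
two_tail = (2 * lo, 2 * hi)                  # two-tailed test doubles both bounds

print(f"t = {t_stat:.3f}, one-tail p in ({lo}, {hi}), two-tail p in {two_tail}")
```

Doubling both bounds reproduces the two-tailed interval 0.01 < p-value < 0.02 stated above.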

[Figure: rejection regions of area α/2 = 0.025 in each tail of the t distribution; central area 1 – α = 0.95]

Two sample t-test

The two sample t-test is used to compare the means in two independent groups of

observations using representative samples.

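The test statistic for the two-sample t-test is:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$$

where $s_p^2$ is the pooled variance,

$$s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$$

and the distribution of t is based on (n1 + n2 - 2) degrees of freedom.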

Assumptions:

The validity of the two-sample t-test depends on various assumptions being satisfied:

1. Each subject must be randomly selected from the population.

2. The random samples are selected in an independent manner from the two

populations.

3. The populations from which the samples are selected both have approximately

normal distributions.

4. The two samples come from populations with equal (or approximately equal)

variances (or standard deviations).

Example The addition of bran to the diet has been reported to benefit patients with

diverticulosis (disease of the colon). Several different bran preparations are available,

and a clinician wants to test the efficacy of two of them on patients, since favorable

claims have been made for each. Among the consequences of administering bran that

requires testing is the transit time through the alimentary canal.

The clinician selects two groups of patients aged 40-64 with diverticulosis of

comparable severity. Sample 1 contains 12 patients who are given treatment A, and

sample 2 contains 15 patients who are given treatment B. The transit times of food

through the gut are measured by a standard technique with marked pellets and the

results are recorded, in order of increasing time, in the next table.

Table Transit times of marker pellets through the alimentary canal of patients with diverticulosis

on two types of treatment: unpaired comparison

Transit times (h)

         Sample 1 (Treatment A)    Sample 2 (Treatment B)
         52                        44
         64                        51
         68                        52
         74                        55
         79                        60
         83                        62
         84                        66
         88                        68
         95                        69
         97                        71
         101                       71
         116                       76
                                   82
                                   91
                                   108

Total    1001                      1026

Mean     83.42                     68.40

Does the transit time through the alimentary canal differ in the two groups of

patients taking these two preparations?

Consider the following:

The samples were independently and randomly selected. The populations of transit

times of food at both treatments have approximately normal distributions and their

variances are equal.

Solution

Summary of test statistics

             Treatment A       Treatment B

n            n1 = 12           n2 = 15

Mean         x̄1 = 83.4167      x̄2 = 68.4

SD           s1 = 17.63        s2 = 16.47

Variance     s1² = 310.99      s2² = 271.4


2. A two-sided test

3. Formulate the hypotheses

The researcher wants to test the following hypothesis:

H0: (µ1 - µ2) = 0 (i.e., no difference between mean transit times of food between treatment A and B)

Ha: (µ1 - µ2) ≠ 0 (i.e., mean transit times of food between treatment A and B are different)

where µ1 and µ2 are the true mean transit times of food under treatments A and B, respectively.


5. The rejection region Since it is possible for the difference in mean transit times

for A-B to be positive or negative, we will employ a two sided test. The rejection

region consists of two areas (each equal to 0.025) in the upper and lower tail of

the sampling distribution.



6-7. The test statistic is

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$$

where $s_p^2$ is the pooled variance:

$$s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} = \frac{(12 - 1)(310.99) + (15 - 1)(271.4)}{12 + 15 - 2} = 288.82$$

Using this pooled sample variance in the computation of the test statistic, with D0 = 0 the difference between the means under H0, we obtain

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{s_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} = \frac{(83.4167 - 68.4) - 0}{\sqrt{288.82 \left( \dfrac{1}{12} + \dfrac{1}{15} \right)}} = 2.281$$

8. The exact p-value for this test cannot be obtained from the Student's t distribution table. The p-value can be stated as an interval, however. In the present example, entering the Student's t distribution table at 25 degrees of freedom (that is, n1 + n2 - 2 = 12 + 15 – 2 = 25), we find that 2.281 lies between 2.167 and 2.485. In other words, it lies between the one-tailed probability values of 0.01 and 0.02. Since we are dealing with a two-tailed test, the p-value is the total area in both tails. Accordingly, 0.01×2 < p-value < 0.02×2, i.e. 0.02 < p-value < 0.04. This probability is smaller than the conventional level of 0.05.

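9. Decision Since the p-value is below the conventional 0.05 level, we reject the null hypothesis that there is no difference between the means; the data suggest that the mean transit times differ between treatments A and B.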
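The pooled two-sample t statistic can also be computed directly from the raw transit times in the table. The following is a minimal sketch using only the Python standard library; the variable names are illustrative. Working from the raw data gives a pooled variance near 288.8 and a t value of about 2.28, and the critical values 2.167 and 2.485 are the ones quoted from the t table in step 8.

```python
from math import sqrt
from statistics import mean, variance

# Transit times (h) from the table
a = [52, 64, 68, 74, 79, 83, 84, 88, 95, 97, 101, 116]           # treatment A
b = [44, 51, 52, 55, 60, 62, 66, 68, 69, 71, 71, 76, 82, 91, 108]  # treatment B

n1, n2 = len(a), len(b)
var1, var2 = variance(a), variance(b)    # sample variances (n - 1 denominator)

# Pooled variance weights each sample variance by its degrees of freedom
pooled = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
t_stat = (mean(a) - mean(b)) / sqrt(pooled * (1 / n1 + 1 / n2))
df = n1 + n2 - 2

# Critical values for df = 25 quoted in the text (one-tail areas 0.02 and 0.01)
between = 2.167 < t_stat < 2.485

print(f"pooled variance = {pooled:.2f}, t = {t_stat:.3f}, df = {df}, "
      f"0.02 < p-value < 0.04: {between}")
```

Pooling is only appropriate under the equal-variance assumption listed above; the two sample variances here (about 311 and 271) are reasonably close.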

[Figure: rejection regions of area α/2 = 0.025 in each tail of the t distribution; central area 1 – α = 0.95]
