RECITATION: ONE SAMPLE HYPOTHESIS TESTS AND TWO SAMPLE HYPOTHESIS TESTS (INDEPENDENT AND DEPENDENT SAMPLES), June 7, 2012


Example: A researcher believes that the mean hemoglobin value in the population is 12. To check this, he randomly selected 64 adults from the population and found a sample mean hemoglobin of 11.2. According to these findings, is the population mean hemoglobin different from 12?

Since

The variable hemoglobin is continuous

The sample size is 64

Hemoglobin is normally distributed

There is only one group

The one-sample t test is used

N Mean Std. Dev.

Hemoglobin 64 11.2 2.8

Hypotheses: H0: μ = 12

Ha: μ ≠ 12

Sample results:

Population mean: 12

Test statistic: Since the population variance is unknown, our test statistic is t

Level of significance: α = 0.05

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{11.2 - 12}{2.8/\sqrt{64}} = -2.29$$

t(α/2, n-1) = t(0.025, 64-1) ≈ 2.00

|tcal| = 2.29 > ttable = 2.00, so reject H0.

The population mean is different from 12.
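As a rough check of the arithmetic in this example, the sketch below recomputes the test statistic from the summary statistics. The function name `one_sample_t` is invented for this illustration, and the critical value 2.00 is simply copied from the t table value quoted in the slides, not computed.

```python
import math

def one_sample_t(sample_mean, mu0, sd, n):
    """Return (t statistic, degrees of freedom) from summary statistics."""
    standard_error = sd / math.sqrt(n)
    return (sample_mean - mu0) / standard_error, n - 1

# Summary statistics from the hemoglobin example
t_cal, df = one_sample_t(sample_mean=11.2, mu0=12, sd=2.8, n=64)

# Assumption: critical value t(0.025, 63) taken from the table as 2.00
t_table = 2.00
reject_h0 = abs(t_cal) > t_table

print(f"t = {t_cal:.2f}, df = {df}, reject H0: {reject_h0}")
```

The comparison uses the absolute value of t because the alternative hypothesis is two-sided.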

Example: Another study reported the median energy intake of 2-year-old children as 1280 kcal. To test this value, the energy intakes of 10 children were calculated. The energy intakes are as follows:

Child          1     2    3     4     5    6     7     8     9     10
Energy intake  1500  825  1300  1700  970  1200  1110  1270  1460  1090

Since

The variable concerning energy intake is continuous

The sample size is not greater than 10

Energy intake is not normally distributed

There is only one group

The sign test is used

H0: The population median is 1280.

HA: The population median is not 1280.

Child          1     2    3     4     5    6     7     8     9     10
Energy intake  1500  825  1300  1700  970  1200  1110  1270  1460  1090
Sign           +     -    +     +     -    -     -     -     +     -

Number of (-) signs = 6 and number of (+) signs = 4

For k = 4 and n = 10, the sign test table gives p = 0.377.

Since p > 0.05, we do not reject H0.

We conclude that there is no evidence that the median energy intake of 2-year-old children differs from 1280 kcal.
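The sign counts and the table value can be reproduced with a short sketch. It assumes the table entry for k = 4, n = 10 is the cumulative binomial probability P(X ≤ k) with success probability 0.5, which matches the quoted 0.377; the function name `sign_test` is hypothetical.

```python
from math import comb

def sign_test(values, median0):
    """Count signs relative to median0 and return the lower-tail binomial probability."""
    plus = sum(v > median0 for v in values)
    minus = sum(v < median0 for v in values)
    n = plus + minus            # observations equal to median0 are dropped
    k = min(plus, minus)        # the less frequent sign
    p_lower = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return plus, minus, p_lower

intakes = [1500, 825, 1300, 1700, 970, 1200, 1110, 1270, 1460, 1090]
plus, minus, p = sign_test(intakes, 1280)

print(f"(+) = {plus}, (-) = {minus}, p = {p:.3f}")
```

Doubling this tail probability would give a two-sided p-value; either way it is well above 0.05 here.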

Example: The dean of the faculty wants to know whether the smoking ratio among the Phase I students is 0.25 or not. For this purpose, 50 students were selected and 18 of them said they smoke. According to these results, can we say that this ratio is different from 0.25 at the 0.05 level of significance?

One-sample z test for proportion / one-sample chi-square test

H0: P = 0.25

HA: P ≠ 0.25

p = 18/50 = 0.36

$$z = \frac{p - P}{\sqrt{PQ/n}} = \frac{0.36 - 0.25}{\sqrt{(0.25)(0.75)/50}} = 1.80$$

Critical z values are ±1.96.

1.80 < 1.96, so we do not reject H0. There is no evidence that the smoking ratio among the Phase I students differs from 0.25.
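A minimal sketch of the z calculation for this example follows. The function name `one_sample_prop_z` is made up for illustration, and the two-sided p-value is obtained from the standard normal tail via `math.erfc`, which is an addition not shown in the slides.

```python
import math

def one_sample_prop_z(successes, n, p0):
    """Return (sample proportion, z statistic) for H0: P = p0."""
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return p_hat, (p_hat - p0) / standard_error

p_hat, z = one_sample_prop_z(successes=18, n=50, p0=0.25)

z_critical = 1.96  # two-sided critical value at the 0.05 level, as in the text
reject_h0 = abs(z) > z_critical

# Two-sided p-value from the standard normal distribution
p_value = math.erfc(abs(z) / math.sqrt(2))

print(f"p = {p_hat:.2f}, z = {z:.2f}, p-value = {p_value:.3f}, reject H0: {reject_h0}")
```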

We can also solve this problem with the one-sample chi-square test.

Smoking  Observed  Expected         (O-E)  (O-E)²  (O-E)²/E
Yes      18        12.5 (50*0.25)   5.5    30.25   2.42
No       32        37.5 (50*0.75)   -5.5   30.25   0.81
Total    50        50               0              3.23

$$\chi^2 = \sum_{i=1}^{2} \frac{(O_i - E_i)^2}{E_i} = 3.23$$

χ²cal = 3.23 < χ²(1, 0.05) = 3.84 ⇒ do not reject H0.

SPSS Output

smoking

       Observed N  Expected N  Residual
Yes    18          12,5        5,5
No     32          37,5        -5,5
Total  50

Test Statistics

               smoking
Chi-Square(a)  3,227
df             1
Asymp. Sig.    ,072

a. 0 cells (,0%) have expected frequencies less than 5. The minimum expected cell frequency is 12,5.
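The chi-square value for the smoking example can be recomputed with the sketch below. The function name `chi_square_gof` is hypothetical, and the p-value line relies on the identity P(χ² > x) = erfc(√(x/2)), which holds only for 1 degree of freedom.

```python
import math

def chi_square_gof(observed, expected_props):
    """Return (expected counts, chi-square statistic) for a goodness-of-fit test."""
    n = sum(observed)
    expected = [n * p for p in expected_props]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return expected, chi2

# Observed smokers / non-smokers and the proportions under H0: P = 0.25
expected, chi2 = chi_square_gof([18, 32], [0.25, 0.75])

chi2_table = 3.84  # chi-square(1, 0.05) from the table, as quoted in the text
reject_h0 = chi2 > chi2_table

# Valid for df = 1 only: upper tail of chi-square via the normal distribution
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"expected = {expected}, chi2 = {chi2:.3f}, p = {p_value:.3f}, reject H0: {reject_h0}")
```

With one degree of freedom the chi-square statistic is the square of the z statistic, so both approaches lead to the same decision.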

Example: In a heart study, the systolic blood pressure was measured for 24 men aged 20 and for 30 men aged 40. Do these data show sufficient evidence to conclude that the older men have a higher systolic blood pressure, at the 0.05 level of significance?

Since

The variable concerning systolic blood pressure is continuous

The sample size of each group is greater than 10

Systolic blood pressure values in each group are normally distributed

There are two groups and they are independent

The independent samples t-test is used

20-year-old                     40-year-old
Subject  Sbp  Subject  Sbp      Subject  Sbp  Subject  Sbp
1        95   13       132      1        150  16       148
2        122  14       100      2        152  17       116
3        130  15       120      3        154  18       128
4        148  16       125      4        160  19       136
5        130  17       115      5        164  20       110
6        150  18       138      6        176  21       126
7        105  19       100      7        108  22       130
8        110  20       118      8        126  23       122
9        130  21       136      9        132  24       140
10       156  22       110      10       142  25       110
11       108  23       140      11       136  26       124
12       124  24       106      12       146  27       136
                                13       114  28       120
                                14       118  29       142
                                15       130  30       114

GROUP        N   Mean      Std. Deviation
20-year-old  24  122,8333  16,7790
40-year-old  30  133,6667  17,3013

[Figure: Mean ± 1 SD of SBP by GROUP (20-year-old, N = 24; 40-year-old, N = 30)]

(1) H0: μ1 = μ2

Ha: μ1 < μ2

(2) Testing the equality of variances

H0: σ1² = σ2²

Ha: σ1² ≠ σ2²

$$F = \frac{S^2_{max}}{S^2_{min}} = \frac{299.33}{281.54} = 1.06 < F_{(30,24,0.05)} = 1.94$$

Do not reject H0. The variances can be assumed equal.

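(3)

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(24 - 1)281.54 + (30 - 1)299.33}{24 + 30 - 2} = 291.46$$

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}} = \frac{(122.83 - 133.67) - 0}{\sqrt{\frac{291.46}{24} + \frac{291.46}{30}}} = -2.31$$

(4) |tcal| = 2.31 > t(52, 0.05) = 1.675, p < 0.05. Reject H0.

(5) The older men have higher systolic blood pressure.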
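Steps (2) to (4) can be retraced from the raw data with the following sketch. The function name `pooled_t` is invented for this illustration; the critical values are not computed, so the table values quoted in the slides would still be needed for the decision.

```python
import math
from statistics import mean, variance

# Systolic blood pressure values from the data table (subject order)
young = [95, 122, 130, 148, 130, 150, 105, 110, 130, 156, 108, 124,
         132, 100, 120, 125, 115, 138, 100, 118, 136, 110, 140, 106]
old = [150, 152, 154, 160, 164, 176, 108, 126, 132, 142, 136, 146, 114, 118, 130,
       148, 116, 128, 136, 110, 126, 130, 122, 140, 110, 124, 136, 120, 142, 114]

def pooled_t(x1, x2):
    """Return (variance ratio, pooled variance, t statistic, df) for two independent samples."""
    n1, n2 = len(x1), len(x2)
    v1, v2 = variance(x1), variance(x2)          # sample variances (n - 1 denominator)
    f_ratio = max(v1, v2) / min(v1, v2)          # larger variance over smaller
    sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    t = (mean(x1) - mean(x2)) / math.sqrt(sp2 / n1 + sp2 / n2)
    return f_ratio, sp2, t, n1 + n2 - 2

f_ratio, sp2, t, df = pooled_t(young, old)
print(f"F = {f_ratio:.2f}, pooled variance = {sp2:.2f}, t = {t:.3f}, df = {df}")
```

Because the sketch works with unrounded means and variances, its t value may differ slightly in the last digit from the hand calculation.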

SPSS Output

Group Statistics

SBP  Group        N   Mean    Std. Deviation
     20 year old  24  122,83  16,779
     40 year old  30  133,67  17,301

Independent Samples Test (SBP)

Levene's Test for Equality of Variances

                             F     Sig.
Equal variances assumed      ,013  ,910

t-test for Equality of Means

                             t       df      Sig. (2-tailed)  Mean Difference  Std. Error Difference  95% CI Lower  95% CI Upper
Equal variances assumed      -2,317  52      ,024             -10,833          4,675                  -20,215       -1,451
Equal variances not assumed  -2,325  50,049  ,024             -10,833          4,659                  -20,191       -1,475

Example: Cryosurgery is a commonly used therapy for the treatment of cervical intraepithelial neoplasia (CIN). The procedure is associated with pain and uterine cramping.

Within 10 min of completing the cryosurgical procedure, the intensity of pain and cramping was assessed on a 100-mm visual analog scale (VAS), in which 0 represents no pain or cramping and 100 represents the most severe pain and cramping.

The purpose of the study was to compare the perceptions of both pain and cramping in women undergoing the procedure with and without paracervical block.

Five women were randomly selected in each group and their scores are as follows:

Group                            Score
Women without a block            14  88  37  27  0
Women with a paracervical block  50  70  37  66  75

Since

The variable concerning pain/cramping score is continuous

The sample size is less than 10

There are two groups and they are independent

The Mann-Whitney U test is used

H0: M_I = M_II

HA: M_I ≠ M_II

Group  Score  Rank
I      0      1
I      14     2
I      27     3
I      37     4.5
II     37     4.5
II     50     6
II     66     7
II     70     8
II     75     9
I      88     10

R1 = 1 + 2 + 3 + 4.5 + 10 = 20.5

$$U_1 = n_1 n_2 + \frac{n_1(n_1 + 1)}{2} - R_1 = 5 \times 5 + \frac{5(5 + 1)}{2} - 20.5 = 19.5$$

$$U_2 = n_1 n_2 - U_1 = 5 \times 5 - 19.5 = 5.5$$

U = 19.5

From the table, the critical value is 21.

19.5 < 21, so we do not reject H0.

We conclude that there is no evidence that the median pain/cramping scores differ between the two groups.
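The ranking and the two U values can be reproduced with a small sketch. The helper names `midranks` and `mann_whitney_u` are hypothetical; tied scores receive the average of the ranks they occupy, as in the rank table.

```python
def midranks(values):
    """Ranks 1..n in ascending order; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1
    return ranks

def mann_whitney_u(x1, x2):
    """Return (rank sum of group 1, U1, U2) using the formulas from the text."""
    n1, n2 = len(x1), len(x2)
    ranks = midranks(list(x1) + list(x2))
    r1 = sum(ranks[:n1])
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 - u1
    return r1, u1, u2

without_block = [14, 88, 37, 27, 0]    # group I
with_block = [50, 70, 37, 66, 75]      # group II
r1, u1, u2 = mann_whitney_u(without_block, with_block)

print(f"R1 = {r1}, U1 = {u1}, U2 = {u2}")
```

The slides compare the larger value, 19.5, with the table value, while the SPSS output reports the smaller one, 5.5; both come from the same pair.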

SPSS Output

Ranks

VAS  Group2  N   Mean Rank  Sum of Ranks
     1       5   4,10       20,50
     2       5   6,90       34,50
     Total   10

Test Statistics(b)

                                VAS
Mann-Whitney U                  5,500
Wilcoxon W                      20,500
Z                               -1,467
Asymp. Sig. (2-tailed)          ,142
Exact Sig. [2*(1-tailed Sig.)]  ,151(a)

a. Not corrected for ties.
b. Grouping Variable: Group2

Descriptive Statistics (VAS)

Group  N Valid  N Missing  Mean   Median  Std. Deviation  Minimum  Maximum  Percentile 25  Percentile 50  Percentile 75
1      5        0          33,20  27,00   33,641          0        88       7,00           27,00          62,50
2      5        0          59,60  66,00   15,726          37       75       43,50          66,00          72,50

[Figure: VAS by Group (Cervical block, Without block)]

Example: We want to know if children in two geographic areas differ with respect to the proportion who are anemic. A sample of one-year-old children seen in a certain group of county health departments during a year was selected from each of the geographic areas composing the departments' clientele. The following information regarding anemia was revealed.

Geographic area  Number in sample  Number anemic  Proportion
1                450               105            0.23
2                375               120            0.32

Test for the difference between two population proportions

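H0: P1 - P2 = 0

Ha: P1 - P2 ≠ 0

p1 = 105/450 = 0.23

p2 = 120/375 = 0.32

$$p = \frac{(450)(0.23) + (375)(0.32)}{450 + 375} = 0.27$$

$$z = \frac{(0.23 - 0.32) - 0}{\sqrt{\frac{(0.27)(0.73)}{450} + \frac{(0.27)(0.73)}{375}}} = -2.78$$

(The value -2.78 is obtained with the unrounded proportions.)

|z| = 2.78, p = 0.0027 < 0.025. Reject H0.

We conclude that the proportion of anemic children differs between the two geographic areas.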
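A minimal sketch of this calculation, working from the raw counts so that no intermediate rounding occurs, is given below. The function name `two_proportion_z` is made up for illustration, and the one-tailed p-value uses the standard normal tail computed with `math.erfc`.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Return (p1, p2, pooled proportion, z) for H0: P1 - P2 = 0."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return p1, p2, pooled, (p1 - p2) / standard_error

p1, p2, pooled, z = two_proportion_z(105, 450, 120, 375)

# One-tailed probability beyond |z|, compared with 0.025 in the text
one_tail_p = 0.5 * math.erfc(abs(z) / math.sqrt(2))

print(f"p1 = {p1:.2f}, p2 = {p2:.2f}, pooled = {pooled:.2f}, z = {z:.2f}, one-tail p = {one_tail_p:.4f}")
```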

Example: A study was conducted to analyze the relation between coronary heart disease (CHD) and smoking. Forty patients with CHD and 50 control subjects were randomly selected from the records, and the smoking habits of these subjects were examined. Observed values are as follows:

         Smoking
CHD      +    -    Total
Yes      10   30   40
No       4    46   50
Total    14   76   90

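Observed and expected frequencies (expected frequencies in parentheses)

         Smoking
CHD      +         -          Total
Yes      10 (6.2)  30 (33.8)  40
No       4 (7.8)   46 (42.2)  50
Total    14        76         90

$$\chi^2 = \sum_{i=1}^{2}\sum_{j=1}^{2} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} = \frac{(10 - 6.2)^2}{6.2} + \frac{(30 - 33.8)^2}{33.8} + \frac{(4 - 7.8)^2}{7.8} + \frac{(46 - 42.2)^2}{42.2} = 4.95$$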

df = (r-1)(c-1) = (2-1)(2-1) = 1

χ²(1, 0.05) = 3.841

χ² = 4.95 > 3.841, so reject H0.

Conclusion: There is a relation between CHD and smoking.
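The expected frequencies and the chi-square statistic for the CHD and smoking table can be sketched as follows. The function name `chi_square_independence` is hypothetical. The sketch keeps the expected counts unrounded, so it gives about 4.89 instead of the 4.95 obtained by hand with expected counts rounded to one decimal.

```python
def chi_square_independence(table):
    """Return (expected counts, chi-square statistic, df) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return expected, chi2, df

# Rows: CHD yes / no; columns: smoking + / -
observed = [[10, 30], [4, 46]]
expected, chi2, df = chi_square_independence(observed)

chi2_table = 3.841  # chi-square(1, 0.05) from the table, as quoted in the text
reject_h0 = chi2 > chi2_table

print(f"chi2 = {chi2:.3f}, df = {df}, reject H0: {reject_h0}")
```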

SPSS Output

CHD * Smoking Crosstabulation

                            Smoking
                            Yes    No     Total
CHD    Yes  Count           10     30     40
            Expected Count  6,2    33,8   40,0
       No   Count           4      46     50
            Expected Count  7,8    42,2   50,0
Total       Count           14     76     90
            Expected Count  14,0   76,0   90,0

Chi-Square Tests

                              Value     df  Asymp. Sig. (2-sided)  Exact Sig. (2-sided)  Exact Sig. (1-sided)
Pearson Chi-Square            4,889(b)  1   ,027
Continuity Correction(a)      3,681     1   ,055
Likelihood Ratio              4,937     1   ,026
Fisher's Exact Test                                                ,040                  ,027
Linear-by-Linear Association  4,835     1   ,028
N of Valid Cases              90

a. Computed only for a 2x2 table
b. 0 cells (,0%) have expected count less than 5. The minimum expected count is 6,22.

Example: A study was conducted to see if a new therapeutic procedure is more effective than the standard treatment in improving the digital dexterity of certain handicapped persons.

Twenty-four pairs of twins were used in the study. One twin of each pair was randomly assigned to receive the new treatment, while the other received the standard therapy. At the end of the experimental period each individual was given a digital dexterity test, with scores as follows.

Since

The variable concerning digital dexterity test scores is continuous

The sample size is greater than 10

Digital dexterity test scores are normally distributed

There are two groups and they are dependent

The paired samples t-test is used

New    Standard  Difference
49     54        -5
56     42        14
70     63        7
83     77        6
83     83        0
68     51        17
84     82        2
63     54        9
67     62        5
79     71        8
88     82        6
48     50        -2
52     41        11
73     67        6
52     57        -5
73     70        3
78     72        6
64     62        2
71     64        7
42     44        -2
51     44        7
56     42        14
40     35        5
81     73        8

Total                   129
Mean   65.46  60.08     5.38
SD     14.38  14.46     5.65

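H0: μd = 0

Ha: μd > 0

$$\bar{d} = \frac{\sum d_i}{n} = \frac{129}{24} = 5.38$$

$$s_d^2 = \frac{\sum (d_i - \bar{d})^2}{n - 1} = 31.90$$

$$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \frac{5.38 - 0}{\sqrt{31.90/24}} = 4.66$$

t(23, 0.05) = 1.714

Since tcalculated > ttable, reject H0.

We conclude that the new treatment is more effective than the standard treatment.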
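The paired calculation can be retraced from the 24 score pairs with the sketch below. The function name `paired_t` is invented for this illustration, and the critical value is the table value quoted in the slides, not a computed one.

```python
import math
from statistics import mean, stdev

# Scores of the 24 twin pairs, in the order of the data table
new = [49, 56, 70, 83, 83, 68, 84, 63, 67, 79, 88, 48,
       52, 73, 52, 73, 78, 64, 71, 42, 51, 56, 40, 81]
standard = [54, 42, 63, 77, 83, 51, 82, 54, 62, 71, 82, 50,
            41, 67, 57, 70, 72, 62, 64, 44, 44, 42, 35, 73]

def paired_t(first, second):
    """Return (mean difference, SD of differences, t statistic, df) for paired data."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    d_bar = mean(diffs)
    s_d = stdev(diffs)                      # sample SD (n - 1 denominator)
    t = d_bar / (s_d / math.sqrt(n))        # H0: mean difference = 0
    return d_bar, s_d, t, n - 1

d_bar, s_d, t, df = paired_t(new, standard)

t_table = 1.714  # t(23, 0.05), one-sided, as quoted in the text
reject_h0 = t > t_table

print(f"mean d = {d_bar:.3f}, s_d = {s_d:.3f}, t = {t:.3f}, df = {df}, reject H0: {reject_h0}")
```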

SPSS Output

Paired Samples Statistics

                  Mean   N   Std. Deviation  Std. Error Mean
Pair 1  New       65,46  24  14,380          2,935
        Standard  60,08  24  14,461          2,952

Paired Samples Test

                        Paired Differences
                        Mean   Std. Deviation  Std. Error Mean  95% CI Lower  95% CI Upper  t      df  Sig. (2-tailed)
Pair 1  New - Standard  5,375  5,648           1,153            2,990         7,760         4,662  23  ,000

[Figure: Digital Dexterity Test Scores (Mean ± SD) for Standard and New]

Example: To test whether a weight-reducing diet is effective, 9 persons were selected. These persons stayed on the diet for two months and their weights were measured before and after the diet. The following are the weights in kg:

         Weights
Subject  Before  After
1        85      82
2        91      92
3        68      62
4        76      73
5        82      81
6        87      83
7        105     85
8        93      88
9        98      90

Since

The variable concerning weight is continuous

The sample size is less than 10

There are two groups and they are dependent

The Wilcoxon signed ranks test is used

         Weights
Subject  Before  After  Difference Di  Sorted Di  Rank  Signed Rank
1        85      82     3              -1         1.5   -1.5
2        91      92     -1             1          1.5   1.5
3        68      62     6              3          3.5   3.5
4        76      73     3              3          3.5   3.5
5        82      81     1              4          5     5
6        87      83     4              5          6     6
7        105     85     20             6          7     7
8        93      88     5              8          8     8
9        98      90     8              20         9     9

T = 1.5

T = 1.5 < T(n=9, α=0.05) = 6, so reject H0 (p < 0.05).

We conclude that the diet is effective.
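The signed ranks and the statistic T can be recomputed with the following sketch. The helper names `midranks` and `wilcoxon_t` are hypothetical, and the sketch assumes that T is the smaller of the two signed rank sums, which agrees with T = 1.5 here.

```python
def midranks(values):
    """Ranks 1..n in ascending order; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1
    return ranks

def wilcoxon_t(before, after):
    """Return (sum of positive ranks, sum of negative ranks, T = smaller sum)."""
    diffs = [b - a for b, a in zip(before, after) if b != a]  # zero differences dropped
    ranks = midranks([abs(d) for d in diffs])                 # rank by absolute difference
    t_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    t_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return t_plus, t_minus, min(t_plus, t_minus)

before = [85, 91, 68, 76, 82, 87, 105, 93, 98]
after = [82, 92, 62, 73, 81, 83, 85, 88, 90]
t_plus, t_minus, t_stat = wilcoxon_t(before, after)

t_table = 6  # T(n=9, alpha=0.05) from the table, as quoted in the text
reject_h0 = t_stat < t_table

print(f"T+ = {t_plus}, T- = {t_minus}, T = {t_stat}, reject H0: {reject_h0}")
```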

[Figure: Weights Before and After the diet]

Example: Thirty-five patients were evaluated for arrhythmia with two different medical devices. Is there a statistically significant difference between the diagnoses of the two devices?

                 Device II
Device I         Arrhythmia (+)  Arrhythmia (-)  Total
Arrhythmia (+)   10              3               13
Arrhythmia (-)   13              9               22
Total            23              12              35

The significance test for the difference between two dependent population proportions / McNemar test

H0: P1 = P2

Ha: P1 ≠ P2

$$z = \frac{|b - c| - 1}{\sqrt{b + c}} = \frac{|3 - 13| - 1}{\sqrt{3 + 13}} = 2.25$$

Critical z value is ±1.96. Since 2.25 > 1.96, reject H0.

McNemar test approach:

$$\chi^2 = \frac{(b - c)^2}{b + c}$$

With continuity correction:

$$\chi^2 = \frac{(|b - c| - 1)^2}{b + c} = \frac{(|3 - 13| - 1)^2}{3 + 13} = 5.1$$

χ²(1, 0.05) = 3.841 < 5.1, p < 0.05; reject H0.
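Both forms of the McNemar calculation depend only on the discordant cells b = 3 and c = 13, as the sketch below illustrates. The function names `mcnemar` and `exact_binomial_p` are made up for illustration. The exact p-value assumes a two-sided binomial test on the discordant pairs with success probability 0.5, which appears to be what the SPSS footnote "Binomial distribution used" refers to.

```python
import math
from math import comb

def mcnemar(b, c):
    """Return (z with continuity correction, uncorrected chi-square, corrected chi-square)."""
    z = (abs(b - c) - 1) / math.sqrt(b + c)
    chi2_uncorrected = (b - c) ** 2 / (b + c)
    chi2_corrected = (abs(b - c) - 1) ** 2 / (b + c)
    return z, chi2_uncorrected, chi2_corrected

def exact_binomial_p(b, c):
    """Two-sided exact p-value: discordant pairs follow Binomial(b + c, 0.5) under H0."""
    n, k = b + c, min(b, c)
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * lower_tail)

# Discordant cells: Device I (+) / Device II (-) = 3, Device I (-) / Device II (+) = 13
z, chi2_uncorrected, chi2_corrected = mcnemar(3, 13)
p_exact = exact_binomial_p(3, 13)

print(f"z = {z:.2f}, corrected chi2 = {chi2_corrected:.4f}, exact p = {p_exact:.3f}")
```

The corrected chi-square is the square of the z statistic, so the z test and the McNemar chi-square test always give the same decision.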

SPSS Output

Device Arrhythmia Crosstabulation

Count
               DeviceII
               Yes  No   Total
DeviceI  Yes   10   3    13
         No    13   9    22
Total          23   12   35

Chi-Square Tests

                  Value  Exact Sig. (2-sided)
McNemar Test             ,021(a)
N of Valid Cases  35

a. Binomial distribution used.
