7/31/2019 Chi Square T Test
TYPES OF STATISTICAL TESTS & THEIR ASSUMPTIONS

1.) PARAMETRIC TESTS
- Based on assumptions made concerning the parameters of the population from which the sample was drawn
- The validity of these tests depends on whether the assumptions about the nature of the sampled population hold
- Usual assumptions include:
  1.) Random selection of the sample
  2.) Normal distribution of the population from which the samples were drawn
  3.) Equality of variances (homoscedasticity) when more than one population is sampled
- Other assumptions: the data are numerical and measured on either an interval or a ratio scale

2.) NON-PARAMETRIC TESTS
- Less stringent (strict) assumptions
- No assumptions are made about the population parameters
- Distribution-free tests
THE CHI-SQUARE (χ²) TEST OF HOMOGENEITY
- A commonly used statistical test
- Compares the observed frequency of elements falling in different categories with the expected frequency if the null hypothesis were true

Types of χ² test:
1.) Chi-square Goodness of Fit Test
2.) Chi-square Test of Association
3.) Chi-square Test of Homogeneity

Uses:
Chi-square test of homogeneity
- is used when we wish to find out whether two or more populations have the same proportions for the different categories of another variable

Data Lay-out
- use a 2 x 2 contingency table (a cross tabulation of 2 variables)
- The rows represent the categories of one variable and the columns represent the categories of another variable
- rows are designated as r
- columns are designated as c
Table 1. Distribution of Subjects by Place of Residence and Blood Pressure Status

                         Blood Pressure Status
Place of Residence    Normotensive    Hypertensive    Total
Alfonso                    348              62          410
Magallanes                 368              46          414
Total                      716             108          824
The χ² test of homogeneity assumes that the categories are collectively exhaustive and mutually exclusive. Samples are presumed to be independent of one another.
Hypothesis Testing Procedure:

Step 1: Statement of the Hypotheses
Ho: The proportion of elements falling in each category of the variable of interest is the same for all groups.
H1: There are differences between groups in the proportion of elements falling in each category of the variable.

Step 2: Setting the level of significance
- Arbitrarily set at .05 or .01

Step 3: Determination of the test statistic
- The χ² test is the test statistic, which follows a chi-square distribution
- The shape of the chi-square distribution is dependent on the degrees of freedom (df), where df = (rows - 1)(columns - 1)

Step 4: Determine the critical region (C.R.)
The critical region of the χ² distribution is determined by the degrees of freedom and the level of significance.
(See chi-square distribution table)
Step 5: Computation of the Chi-square statistic

Formula:
$$ \chi^2 = \sum \frac{(O - E)^2}{E} $$

where
$$ E = \frac{\text{row total} \times \text{column total}}{\text{grand total}} $$

For Chi-square to be applicable, all the expected frequencies (E) must be > 5; otherwise, the Fisher Exact Test will be used.
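The expected-frequency formula and the applicability rule above can be sketched in a few lines of Python. This is only an illustration: the function names `expected_counts` and `chi_square_applicable` are assumptions made for the example, and the counts are those of Table 1.

```python
def expected_counts(observed):
    """Return expected frequencies E = row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_applicable(observed, minimum=5):
    """Apply the rule stated above: every expected frequency must exceed `minimum`."""
    return all(e > minimum for row in expected_counts(observed) for e in row)


# Table 1 counts: rows = Alfonso, Magallanes; columns = normotensive, hypertensive
table_1 = [[348, 62], [368, 46]]
expected = expected_counts(table_1)
applicable = chi_square_applicable(table_1)
```

If `applicable` came back false for some table, the notes point to the Fisher Exact Test instead of Chi-square.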
Step 6: Statistical Decision
Reject Ho if the computed value of Chi-square falls in the critical region. Otherwise, do not reject Ho.

Step 7: Drawing Conclusions
This depends on the statistical decision.
Sample Problem for Chi-square Test of Homogeneity:
Step 1: State the Hypothesis
Ho: P1 = P2
The prevalence (proportion) of hypertension in Alfonso is equal to the prevalence in Magallanes.
H1: P1 ≠ P2

Step 2: Level of significance: α = .05

Step 3: Test statistic = Chi-square test of homogeneity
$$ \chi^2 = \sum \frac{(O - E)^2}{E} $$
Step 4: Critical region (C.R.): α = .05
df = (rows - 1)(columns - 1)
   = (2 - 1)(2 - 1)
   = 1
Look at α = .05 with 1 degree of freedom in the Chi-square distribution table; this corresponds to 3.84.
Therefore the C.R. is: χ² (observed) > χ² (.05, 1 df) = 3.84
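The tabulated value 3.84 can be cross-checked without a printed table. The sketch below relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable, so it only covers the df = 1 case; the function name `chi_square_critical_1df` is an assumption made for this example.

```python
from statistics import NormalDist


def chi_square_critical_1df(alpha):
    """Upper-tail critical value of the chi-square distribution with 1 df.

    A chi-square variable with 1 df is the square of a standard normal
    variable, so the cut-off is the squared two-sided normal quantile.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z ** 2


critical_05 = chi_square_critical_1df(0.05)  # compare with 3.84 from the table
critical_01 = chi_square_critical_1df(0.01)
```

For tables larger than 2 x 2 (df > 1) this shortcut does not apply, and a chi-square table or a statistics library would be needed.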
                         Blood Pressure Status
Place of Residence    Normotensive    Hypertensive    Total
Alfonso                    348              62          410
Magallanes                 368              46          414
Total                      716             108          824
Step 5: Computations

Oij     Eij      (Oij - Eij)    (Oij - Eij)²    (Oij - Eij)²/Eij
348     356.3       -8.3            68.9             0.19
 62      53.7        8.3            68.9             1.28
368     359.7        8.3            68.9             0.19
 46      54.3       -8.3            68.9             1.27

Remember:
Expected frequency (E) = (row total x column total) / grand total

E = (410 x 716) / 824 = 356.3
E = (410 x 108) / 824 = 53.7
E = (414 x 716) / 824 = 359.7
E = (414 x 108) / 824 = 54.3

χ² = 0.19 + 1.28 + 0.19 + 1.27 = 2.93
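The hand computation can be reproduced with a short Python sketch. The function name `chi_square_statistic` is an assumption made for this example. Because the code carries the expected frequencies at full precision rather than rounding them to one decimal place, it gives about 2.91 instead of the hand-rounded 2.93; either value is below 3.84, so the decision is the same.

```python
def chi_square_statistic(observed):
    """Sum (O - E)^2 / E over every cell of a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total  # expected frequency
            chi2 += (o - e) ** 2 / e
    return chi2


# Rows = Alfonso, Magallanes; columns = normotensive, hypertensive
chi2 = chi_square_statistic([[348, 62], [368, 46]])
```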
Step 6: Statistical Decision:
χ² = 2.93 < 3.84
Since the calculated χ² = 2.93 is less than 3.84, DO NOT reject Ho.
The value 2.93 falls in the area of non-rejection (see diagram).
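The same decision can be expressed through a p-value rather than a critical value. The sketch below is limited to 1 degree of freedom, where the upper-tail chi-square probability equals the two-sided standard normal tail probability; the function names are assumptions made for this example.

```python
import math


def chi_square_p_value_1df(chi2):
    """Upper-tail p-value for a chi-square statistic with 1 df.

    P(chi-square > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)) for 1 df.
    """
    return math.erfc(math.sqrt(chi2 / 2))


def decide(chi2, alpha=0.05):
    """Return the statistical decision for a 1-df chi-square statistic."""
    p = chi_square_p_value_1df(chi2)
    return "reject Ho" if p <= alpha else "do not reject Ho"


p_value = chi_square_p_value_1df(2.93)  # the hand-computed statistic
decision = decide(2.93)
```

A p-value above α = .05 corresponds to the statistic falling in the area of non-rejection.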
Step 7: Conclusion:
There is insufficient evidence to conclude that the prevalence of hypertension in Alfonso differs from the prevalence in Magallanes.
[Diagram: chi-square distribution showing the AREA OF NON-REJECTION and the AREA OF REJECTION (CRITICAL REGION)]
REVIEW OF THE CRITICAL REGION

Critical Region (C.R.) or Region of Rejection
- Set of values of the test statistic which leads to the rejection of the null hypothesis
- These values are those whose probability of occurrence is less than or equal to the level of significance (α)
- They are found at the tail end of the distribution
- The values whose probability of occurrence is greater than α comprise the region of non-rejection
- The size of the CR is determined by α
- The location of the CR is determined by the nature of the alternative hypothesis, whether it is one-tailed or two-tailed
- See diagrams of the CR at differing alpha levels and direction of the alternative hypothesis.
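How α sets the size of the critical region, and how the alternative hypothesis sets its location, can be illustrated numerically. The sketch below uses the standard normal distribution purely as an illustrative test distribution (an assumption for this example; the notes apply the same idea to the chi-square and t distributions), and the function name `critical_values` is hypothetical.

```python
from statistics import NormalDist


def critical_values(alpha, tail):
    """Return the cut-off(s) of the critical region for a standard normal statistic.

    tail: "two-sided" splits alpha between both tails,
          "upper" or "lower" puts all of alpha in one tail.
    """
    z = NormalDist()
    if tail == "two-sided":
        cut = z.inv_cdf(1 - alpha / 2)
        return (-cut, cut)
    if tail == "upper":
        return (z.inv_cdf(1 - alpha),)
    if tail == "lower":
        return (z.inv_cdf(alpha),)
    raise ValueError("tail must be 'two-sided', 'upper' or 'lower'")


two_sided_05 = critical_values(0.05, "two-sided")
upper_05 = critical_values(0.05, "upper")
lower_01 = critical_values(0.01, "lower")
```

A smaller α pushes the cut-off further into the tail, and a two-tailed alternative places half of α in each tail.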
STATISTICALLY & NON-STATISTICALLY SIGNIFICANT RESULTS

When the null hypothesis is rejected, the results are said to be statistically significant, and the difference between the observed and expected is not attributed to sampling variation.
When the null hypothesis is not rejected, the results are said to be not statistically significant, and sampling variation is a likely explanation of the observed difference.
TESTING THE DIFFERENCE IN MEAN VALUES BETWEEN TWO INDEPENDENT GROUPS

The T-test for Independent Samples

Independent: the two groups each stand alone and are mutually exclusive
E.g. Group A is all male, Group B is all female
     Treatment arm = 14 students, Control arm = 15 students

Statistical Assumptions of the T-test
1.) The observations in each group follow a normal distribution.
2.) Sample size of each group is at least 30
3.) The standard deviation (variance) in the two samples is equal (homogeneity of variance)
4.) The values observed in one group have nothing to do with the observations of the other group (independence)
Sample Problem:

In an experiment, 45 women were randomized to receive a paracervical block prior to cryosurgery while another 39 received no paracervical block. The mean pain score in the treatment arm was 35.60 points while that in the control arm was 51.41 points. (See table below)

Is the mean pain score in the treatment arm significantly lower than that observed in the control arm?

Variable: Total cramping score

Group        N     Mean Score     SD       SE of the mean
No block     39      51.41       28.11         4.50
Block        45      35.60       28.45         4.24
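The "SE of the mean" column follows from the other columns as SD divided by the square root of N. A minimal check, with the helper name `standard_error` chosen only for this example:

```python
import math


def standard_error(sd, n):
    """Standard error of the mean: SD / sqrt(n)."""
    return sd / math.sqrt(n)


se_no_block = standard_error(28.11, 39)  # "No block" row of the table
se_block = standard_error(28.45, 45)     # "Block" row of the table
```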
Step 1: State the null hypothesis:

Ho: Women who had a paracervical block prior to cryosurgery had a mean cramping score at least as high as women who had no block.
Ha: Women who had a paracervical block prior to cryosurgery had a lower mean cramping score than women who had no block.

In symbols (group 1 = block, group 2 = no block):
Ho: x̄1 = x̄2
Ha: x̄1 < x̄2 (one-tailed test)
Step 2: Let us use α = .01
(there will be only 1 chance in 100 that we will incorrectly conclude that cramping is less with the paracervical block if it really is not)
Step 3: Test statistic will be the T-test for independent samples
(Assuming that the observations follow a normal distribution, the SDs are equal and the observations are independent)

$$ t_{(n_1 + n_2 - 2)} = \frac{\bar{x}_1 - \bar{x}_2}{SD_p \sqrt{(1/n_1) + (1/n_2)}} $$

where SDp is the pooled standard deviation, computed using the formula:

$$ SD_p = \sqrt{\frac{(n_1 - 1)\,SD_1^2 + (n_2 - 1)\,SD_2^2}{n_1 + n_2 - 2}} $$
Step 4: Determine the critical region
The degrees of freedom are:
df = (n1 + n2 - 2)
df = (45 + 39 - 2)
df = 82
The critical value for df = 82 (use 60 in the table) and α = .01 for a 1-tailed test is -2.39.
Reject the null hypothesis if the observed value of t is < -2.39
Step 5: Computations

$$ t_{(45 + 39 - 2)} = \frac{35.6 - 51.4}{28.27 \sqrt{(1/45) + (1/39)}} = -2.56 $$
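The pooled standard deviation and the t statistic can be recomputed from the summary table with a short Python sketch. The function names are assumptions made for this example, and group 1 is taken to be the block arm (n = 45) as in the computation above. Pairing each SD with its own group size as listed in the table and carrying full precision gives a pooled SD of about 28.29 and t of about -2.55, slightly different from the hand-computed 28.27 and -2.56; the statistic is still below -2.39.

```python
import math


def pooled_sd(n1, sd1, n2, sd2):
    """Pooled standard deviation of two independent samples."""
    numerator = (n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2
    return math.sqrt(numerator / (n1 + n2 - 2))


def independent_t(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance t statistic for two independent samples."""
    sd_p = pooled_sd(n1, sd1, n2, sd2)
    return (mean1 - mean2) / (sd_p * math.sqrt(1 / n1 + 1 / n2))


# Group 1 = block (n = 45), group 2 = no block (n = 39)
sd_p = pooled_sd(45, 28.45, 39, 28.11)
t = independent_t(35.60, 28.45, 45, 51.41, 28.11, 39)
df = 45 + 39 - 2
```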
Step 6: Statistical Decision
The calculated or observed t is -2.56, which is less than the critical value of -2.39, so we reject the null hypothesis.
Step 7 Conclusion
In this study, on the average, women who had a paracervical
block prior to cryosurgery experienced less total cramping
than women who did not have the block.
Note:
The conclusion refers to women on the average and does not
mean that every woman with a paracervical block would
experience less cramping.
TESTING THE DIFFERENCE IN MEAN VALUES BETWEEN TWO DEPENDENT GROUPS (OR PAIRED SAMPLES)

Means when the Same Group is Measured Twice
- paired designs or repeated-measures designs
- before and after measurements
- the researcher asks whether the intervention makes a difference or not (whether there is a change)
- test statistic used is the paired samples T-test, also called the matched groups T-test or the dependent group T-test
Sample Case Problem:

Fifty-one (51) patients undergoing cholecystectomy were evaluated before, 1 month after, and 3 months after cholecystectomy. Patients were interviewed about the quality and frequency of their stools. In addition, to evaluate the role of bile acid malabsorption, serum concentrations of 7-alpha-hydroxy-4-cholesten-3-one (7α-HCO) were measured before and after surgery. The data given were as follows:

Serum value         Before CHE         After CHE (1 month)
7α-HCO (ng/mL)      25.33 ± 13.51      46.55 ± 29.58

Question:
Is there a true difference in the 7α-HCO before and after surgery, so that one can conclude that indeed surgery is beneficial?
Step 1: State the Hypothesis
Ho: The true difference of 7α-HCO is zero
Ha: The true difference of 7α-HCO is not zero

In symbols:
Ho: δ = 0 (the symbol delta stands for the difference in the population)
Ha: δ ≠ 0 (2-tailed test) (if 7α-HCO significantly increases or decreases)

Step 2: Level of significance: α = .01

Step 3: Test statistic is the paired samples T-test
(Assuming that the data are on an interval or ratio scale and the differences are normally distributed.)
Step 4: Determine the critical region
This is the value of t that must be attained for the result to be declared significant. With n - 1 degrees of freedom, the value of t that bounds the central 99% of the distribution, leaving 0.5% of the area in each tail, is 2.682 by interpolation.

df = n - 1
df = 51 - 1 or 50

C.R. = Reject the null hypothesis of no difference if the value of the t-statistic is less than -2.682 or greater than +2.682
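Step 5: Computations:

$$ t = \frac{\bar{d} - 0}{SD_d / \sqrt{n}} $$

where
$\bar{d}$ = the mean difference
0 = the assumed difference in the population
$SD_d / \sqrt{n}$ = standard error of the mean difference

$$ SD_d = \sqrt{\frac{\sum (d - \bar{d})^2}{n - 1}} \qquad \text{where} \qquad \bar{d} = \frac{\sum d}{n} $$

SDd = 26.68

Substituting:

$$ t = \frac{46.55 - 25.33 - 0}{26.68 / \sqrt{51}} = \frac{21.22}{3.74} = 5.68 $$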
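The paired computation can be sketched in Python in two ways: from the summary figures given in the notes, and from raw before/after pairs. The function names are assumptions made for this example, and the five before/after pairs are hypothetical values included only to show how the mean difference and SDd are obtained from individual measurements; they are not the study data.

```python
import math
from statistics import mean, stdev


def paired_t_from_summary(mean_diff, sd_diff, n, hypothesized=0.0):
    """Paired t statistic from the mean and SD of the differences."""
    return (mean_diff - hypothesized) / (sd_diff / math.sqrt(n))


def paired_t_from_pairs(before, after):
    """Paired t statistic from raw before/after measurements."""
    diffs = [a - b for b, a in zip(before, after)]
    # stdev uses the n - 1 denominator, matching the SDd formula
    return paired_t_from_summary(mean(diffs), stdev(diffs), len(diffs))


# Summary figures from the notes: means 46.55 and 25.33, SDd = 26.68, n = 51
t_summary = paired_t_from_summary(46.55 - 25.33, 26.68, 51)

# Hypothetical raw pairs, for illustration only
before = [20.0, 31.0, 18.0, 27.0, 24.0]
after = [35.0, 40.0, 30.0, 52.0, 33.0]
t_raw = paired_t_from_pairs(before, after)
```

The pairing matters: each difference is taken within the same patient, which is what distinguishes this test from the independent-samples version.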
Step 6: Statistical Decision
Since the observed value of t is 5.68, which is larger than the critical value of 2.682, we reject the null hypothesis.

Step 7: Conclusion:
Mean serum values for 7α-HCO are NOT the same before cholecystectomy and 1 month later (p-value < .01).

[...] 5 cms in size.
4.) A trialist wants to know if a drug is effective in diminishing psychosis in the emergency room in a non-randomized study (quasi-experimental study design).
5.) Dr. Lin wants to know if the proportion who scored above 60 points and below 60 points differs between the 15 nurses and the 15 midwives.