8/6/2019 Calculation for Chi-Sqr, T-test
http://slidepdf.com/reader/full/calculation-for-chi-sqr-t-test 1/7
Chi-Square Test
The Chi-Square Test procedure tabulates a variable into categories and computes a chi-square statistic. This goodness-of-fit test compares the observed and expected frequencies
in each category to test either that all categories contain the same proportion of values or that each category contains a user-specified proportion of values.
Examples. The chi-square test could be used to determine whether a bag of jelly beans contains equal proportions of blue, brown, green, orange, red, and yellow candies. You could also test whether a bag of jelly beans contains 5% blue, 30% brown, 10% green, 20% orange, 15% red, and 15% yellow candies.
Statistics. Mean, standard deviation, minimum, maximum, and quartiles. The number and the percentage of nonmissing and missing cases, the number of cases observed and expected for each category, residuals, and the chi-square statistic.
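The jelly-bean example above can be worked through numerically. A minimal Python sketch; the observed counts below are invented for illustration, only the 5%/30%/10%/20%/15%/15% proportions come from the text:

```python
# Chi-square goodness-of-fit sketch for the jelly-bean example.
# The observed counts are hypothetical, not from the text.
observed = {"blue": 12, "brown": 55, "green": 22,
            "orange": 38, "red": 33, "yellow": 40}
# User-specified proportions from the example.
proportions = {"blue": 0.05, "brown": 0.30, "green": 0.10,
               "orange": 0.20, "red": 0.15, "yellow": 0.15}

total = sum(observed.values())            # 200 candies in this bag
chi_square = sum(
    (observed[c] - proportions[c] * total) ** 2 / (proportions[c] * total)
    for c in observed
)
df = len(observed) - 1                    # 6 categories -> 5 degrees of freedom
print(round(chi_square, 2), df)
```

The resulting value is compared against a chi-square table with 5 degrees of freedom at the chosen significance level.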
Calculation of the Chi-Square Value from a Cross Tabulation:

X² = ∑ (f_o − f_e)² / f_e    (Chi-Square)

Where:
f_o is the observed frequency in each cell
f_e is the expected frequency in each cell (row total × column total / grand total)

Summing (f_o − f_e)²/f_e over all cells of the cross tabulation below gives

X² ≈ 36.72

with (7 − 1) × (2 − 1) = 6 degrees of freedom. Check this value against the chi-square distribution table (in any statistics book) at the chosen level of significance.
Years                      Complaints Received   Complaints Solved   Total
April-06 to March-07       611 (69.3%)           271 (30.7%)         882 (100.0%)
April-05 to March-06       550 (57.9%)           400 (42.1%)         950 (100.0%)
April-04 to March-05       702 (63.7%)           400 (36.3%)         1,102 (100.0%)
April 1996 to March 1997   1,775 (60.6%)         1,153 (39.4%)       2,928 (100.0%)
April 1995 to March 1996   1,772 (61.7%)         1,100 (38.3%)       2,872 (100.0%)
April 1994 to March 1995   1,355 (60.6%)         880 (39.4%)         2,235 (100.0%)
April 1998 to March 1999   1,055 (58.8%)         738 (41.2%)         1,793 (100.0%)
Total                      7,820 (61.3%)         4,942 (38.7%)       12,762 (100.0%)
                     Value    df   Asymp. Sig. (2-sided)
Pearson Chi-Square   36.718   6    0.000
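The SPSS statistic above can be reproduced by hand from the cross tabulation. A minimal Python sketch using the (received, solved) counts from the table:

```python
# Recompute the Pearson chi-square for the complaints cross tabulation.
# Each row holds (received, solved) counts for one year band.
rows = [
    (611, 271), (550, 400), (702, 400), (1775, 1153),
    (1772, 1100), (1355, 880), (1055, 738),
]

grand_total = sum(r + s for r, s in rows)      # 12,762
col_totals = [sum(r for r, _ in rows),         # 7,820 received
              sum(s for _, s in rows)]         # 4,942 solved

chi_square = 0.0
for row in rows:
    row_total = sum(row)
    for observed, col_total in zip(row, col_totals):
        # Expected cell count under independence of year and outcome.
        expected = row_total * col_total / grand_total
        chi_square += (observed - expected) ** 2 / expected

df = (len(rows) - 1) * (len(col_totals) - 1)   # (7-1)(2-1) = 6
print(round(chi_square, 3), df)
```

This reproduces the 36.718 with 6 degrees of freedom reported in the SPSS output above.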
Paired-Samples T Test (Related Samples)
The Paired-Samples T Test procedure compares the means of two variables for a single
group. It computes the differences between values of the two variables for each case and tests whether the average differs from 0.
Example. In a study on high blood pressure, all patients are measured at the beginning of the study, given a treatment, and measured again. Thus, each subject has two measures, often called before and after measures. An alternative design for which this test is used is a matched-pairs or case-control study. Here, each record in the data file contains the response for the patient and also for his or her matched control subject. In a blood pressure study, patients and controls might be matched by age (a 75-year-old patient with a 75-year-old control group member).
Statistics. For each variable: mean, sample size, standard deviation, and standard error of the mean. For each pair of variables: correlation, mean difference, t test, and confidence interval for the mean difference (you can specify the confidence level). Standard deviation and standard error of the mean difference.
Procedures
To test a sample mean from one group of cases against that from another group of cases, use the Independent-Samples T Test. If you want to compare a sample mean against a constant value, use the One-Sample T Test. If the data in the test variable are not
quantitative, but ordered, or are not normally distributed, use the Wilcoxon signed-rank
test (select 2 Related Samples from the Nonparametric Tests submenu).
Data. For each paired test, specify two quantitative variables (interval or ratio level of measurement). For a matched-pairs or case-control study, the response for each test subject and its matched control subject must be in the same case in the data file.

Assumptions. Observations for each pair should be made under the same conditions. The mean differences should be normally distributed. Variances of each variable can be equal or unequal.
Independent-Samples T Test
The Independent-Samples T Test procedure compares means for two groups of cases.
Ideally, for this test, the subjects should be randomly assigned to two groups, so that any difference in response is due to the treatment (or lack of treatment) and not to other factors. This is not the case if you compare average income for males and females. A person is not randomly assigned to be male or female. In such situations, you should ensure that differences in other factors are not masking or enhancing a significant difference in means. Differences in average income may be influenced by factors such as education and not by sex alone.
Example. Patients with high blood pressure are randomly assigned to a placebo group and a treatment group. The placebo subjects receive an inactive pill and the treatment subjects receive a new drug that is expected to lower blood pressure. After treating the subjects for two months, the two-sample t test is used to compare the average blood pressures for the placebo group and the treatment group. Each patient is measured once and belongs to one group.
Statistics. For each variable: sample size, mean, standard deviation, and standard error of the mean. For the difference in means: mean, standard error, and confidence interval (you can specify the confidence level). Tests: Levene's test for equality of variances, and both pooled- and separate-variances t tests for equality of means.
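The pooled-variances t test described above can be sketched in a few lines of Python; the blood-pressure readings here are invented for illustration:

```python
import math

# Hypothetical systolic blood pressures after two months of treatment.
placebo   = [142, 140, 144, 149, 138, 145]
treatment = [132, 130, 136, 129, 135, 130]

def mean(xs):
    return sum(xs) / len(xs)

def sum_sq_dev(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)

n1, n2 = len(placebo), len(treatment)
# The pooled variance assumes the two group variances are equal
# (the assumption Levene's test checks in SPSS).
pooled_var = (sum_sq_dev(placebo) + sum_sq_dev(treatment)) / (n1 + n2 - 2)
se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t = (mean(placebo) - mean(treatment)) / se
df = n1 + n2 - 2

print(round(t, 3), df)   # compare against the t table with n1 + n2 - 2 df
```

A large positive t here would suggest the drug lowered blood pressure relative to placebo; the separate-variances (Welch) form would instead divide by each group's own variance.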
Correlation Analysis
• The sample correlation coefficient (r) measures the degree of linearity in the relationship between X and Y.
−1 ≤ r ≤ +1
• r = 0 indicates no linear relationship
• In Excel, use =CORREL(array1, array2), where array1 is the range for X and array2 is the range for Y.
Paired Sample T-Test
The paired sample t-test is a statistical technique used to compare two population means when the two samples are correlated. It is used in 'before-after' studies, when the samples are matched pairs, or in case-control studies. For example, if we give training to company employees and want to know whether or not the training had any impact on their efficiency, we could use the paired sample test. We collect data from each employee on a seven-point rating scale, before the training and after the training. By using the paired sample t-test, we can statistically conclude whether or not the training improved the efficiency of the employees. In medicine, by using the paired sample t-test, we can figure out whether or not a particular medicine will cure the illness.
Assumptions in the paired sample t-test:

1. Only matched pairs can be used to perform the paired sample t-test.
2. Normal distributions are assumed.
3. Variance: it is assumed that the variance of the two samples is the same.
4. Independence of observations: the observations must be independent of each other.
Steps in the calculation of paired sample t-test:
1. Set up the hypotheses: To calculate the paired sample t-test, first we have to set up the hypotheses. The first is the null hypothesis, which assumes that the means of the two paired samples are equal. The second is the alternative hypothesis, which assumes that the means of the two paired samples are not equal.

2. Select the level of significance: After stating the hypotheses, we choose the level of significance. In most cases the significance level is 5%; in medicine it is often set at 1%.

3. Calculate the test statistic: We use the following formula for the paired sample t-test:

   t = d̄ / (s / √n)

   where d̄ is the mean difference between the two samples, s² is the sample variance of the differences, n is the sample size, and t follows a t-distribution with n − 1 degrees of freedom.

   An alternate (algebraically equivalent) formula for the paired sample t-test is:

   t = ∑d / √[ (n ∑d² − (∑d)²) / (n − 1) ]
4. Test the hypothesis (decision making): After calculating the statistic, we compare the calculated value with the table value. If the calculated value is greater than the table value, we reject the null hypothesis. If the calculated value is less than the table value, we retain the null hypothesis and conclude that there is no significant mean difference between the two paired samples.
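The steps above can be sketched in Python. The before/after ratings here are hypothetical seven-point-scale data, following the training example:

```python
import math

# Hypothetical efficiency ratings (seven-point scale) before and after training.
before = [4, 3, 5, 4, 3, 4, 5, 4]
after  = [5, 4, 5, 6, 4, 5, 6, 5]

diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)
d_bar = sum(diffs) / n                                # mean difference
s2 = sum((d - d_bar) ** 2 for d in diffs) / (n - 1)   # sample variance of diffs
t = d_bar / math.sqrt(s2 / n)                         # t = d_bar / (s / sqrt(n))
df = n - 1

print(round(t, 3), df)   # compare against the t table with n - 1 df
```

If this t exceeds the table value at the chosen significance level with n − 1 degrees of freedom, the null hypothesis of equal means is rejected.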
Paired sample t-test in SPSS:
Most statistical software performs the paired sample t-test. In SPSS, it is available under the "Analyze" menu, in the "Compare Means" option. When we click on "Paired-Samples T Test," the following window appears in SPSS:
Now, from the left side, we select the first paired variable and drag it into the paired variables box (Variable1), and then select the second paired variable and drag it into the second variable place. From the "Options" menu, we select the confidence interval and then click the "OK" button. The result window will then show the result for the paired sample t-test. The first two tables in the SPSS output show the descriptive statistics and the correlation between the paired variables. The next table shows the value of the paired sample t-test together with its probability value. The table will look like the following:
This table shows the paired sample t-test value together with its p value and other statistics. By using the p value, we can make decisions about the pair of sample means. For instance, this table shows the result for the pair BI and BI2: the p value is .896, which is greater than the 5% significance level. In this example, the difference between the paired sample means is not significant; that is, the means of the two paired samples can be considered equal.
The most familiar measure of dependence between two quantities is the Pearson product-moment correlation coefficient, or "Pearson's correlation." It is obtained by dividing the covariance of the two variables by the product of their standard deviations. Karl Pearson developed the coefficient from a similar but slightly different idea by Francis Galton.[4]

The population correlation coefficient ρ_{X,Y} between two random variables X and Y with expected values μ_X and μ_Y and standard deviations σ_X and σ_Y is defined as:

ρ_{X,Y} = corr(X, Y) = cov(X, Y) / (σ_X σ_Y) = E[(X − μ_X)(Y − μ_Y)] / (σ_X σ_Y)
where E is the expected value operator, cov means covariance, and corr is a widely used alternative notation for Pearson's correlation.
The Pearson correlation is defined only if both of the standard deviations are finite and both of them are nonzero. It is a corollary of the Cauchy–Schwarz inequality that the correlation cannot exceed 1 in absolute value. The correlation coefficient is symmetric: corr(X, Y) = corr(Y, X).

The Pearson correlation is +1 in the case of a perfect positive (increasing) linear relationship (correlation), −1 in the case of a perfect decreasing (negative) linear relationship (anticorrelation),[5] and some value between −1 and 1 in all other cases, indicating the degree of linear dependence between the variables. As it approaches zero there is less of a relationship (closer to uncorrelated). The closer the coefficient is to either −1 or 1, the stronger the correlation between the variables.
If the variables are independent, Pearson's correlation coefficient is 0, but the converse is not true, because the correlation coefficient detects only linear dependencies between two variables. For example, suppose the random variable X is symmetrically distributed about zero, and Y = X². Then Y is completely determined by X, so that X and Y are perfectly dependent, but their correlation is zero; they are uncorrelated. However, in the special case when X and Y are jointly normal, uncorrelatedness is equivalent to independence.
If we have a series of n measurements of X and Y written as x_i and y_i, where i = 1, 2, ..., n, then the sample correlation coefficient r can be used to estimate the population Pearson correlation ρ between X and Y. The sample correlation coefficient is written

r = ∑ (x_i − x̄)(y_i − ȳ) / [ (n − 1) s_x s_y ]

where x̄ and ȳ are the sample means of X and Y, and s_x and s_y are the sample standard deviations of X and Y.
This can also be written as:

r = ∑ (x_i − x̄)(y_i − ȳ) / √[ ∑ (x_i − x̄)² · ∑ (y_i − ȳ)² ]
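The two forms of r are algebraically equivalent, which a small sketch confirms (the data here are invented):

```python
import math

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.0, 9.8]   # roughly linear in xs

n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

# Form 1: cross-deviations over (n - 1) * s_x * s_y.
s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
r1 = num / ((n - 1) * s_x * s_y)

# Form 2: the computational form with sums of squared deviations.
r2 = num / math.sqrt(sum((x - mean_x) ** 2 for x in xs)
                     * sum((y - mean_y) ** 2 for y in ys))

print(round(r1, 6), round(r2, 6))   # the two forms agree
```

This is the same quantity Excel's =CORREL(array1, array2) returns for the two ranges.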