EDRS 6208
Analysis and Interpretation of Data
Nonparametric Tests
Chi-square Distribution


Outline
- What are nonparametric tests?
- The chi-square test: definition
- The chi-square distribution
- A goodness-of-fit test
- Contingency table

Parametric and Nonparametric Tests

Parametric data
- Measured data
- Used with interval- or ratio-level data
- Usually assumed to be normally or almost normally distributed
- Scores are considered parametric data
- Tests: z, t-tests, ANOVA, Pearson's

Nonparametric data
- Counted (nominal) or ranked data
- Does not assume a normal distribution of the population
- Tests: χ² or Spearman's

Nonparametric tests
Nonparametric tests are also known as distribution-free tests. They are appropriate when:
- The nature of the population distribution from which the sample is drawn is not known to be normal.
- The variables are expressed in nominal form (they are classified into categories and represented by frequency counts).
- The variables are expressed in ordinal form (they are ranked in order, expressed as first, second, third, and so on) (Best and Kahn, 2006, p. 433).

THE CHI-SQUARE DISTRIBUTION

Definition
The chi-square distribution has only one parameter, called the degrees of freedom (df). The shape of a chi-square distribution curve is skewed to the right for small df and becomes approximately symmetric for large df. The entire chi-square distribution curve lies to the right of the vertical axis. The chi-square distribution assumes nonnegative values only, and these are denoted by the symbol χ² (read as "chi-square").

Prem Mann, Introductory Statistics, 7/E. Copyright 2010 John Wiley & Sons. All rights reserved.

Figure 11.1 Three chi-square distribution curves.
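How quickly the right skew fades can be seen from the standard moments of the chi-square distribution: mean df, variance 2·df, and skewness √(8/df). These formulas are textbook results rather than something stated on the slide, and the helper name `chi_square_shape` is hypothetical; treat this as a small sketch, not part of the course material.

```python
import math

def chi_square_shape(df):
    """Return (mean, variance, skewness) of a chi-square distribution.

    Hypothetical helper for illustration; the formulas are standard results.
    """
    mean = df                      # the mean equals the degrees of freedom
    variance = 2 * df              # the variance is twice the degrees of freedom
    skewness = math.sqrt(8 / df)   # always positive (right skew), shrinks as df grows
    return mean, variance, skewness

for df in (2, 7, 12, 100):
    mean, variance, skewness = chi_square_shape(df)
    print(f"df={df:>3}  mean={mean:>3}  variance={variance:>3}  skewness={skewness:.3f}")
```

The skewness stays positive for every df but moves toward 0 as df increases, which matches the description of the curves becoming more symmetric.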

Example 1

Find the value of χ² for 7 degrees of freedom and an area of .10 in the right tail of the chi-square distribution curve.

[Table: χ² for df = 7 and .10 area in the right tail]

[Figure: critical value of χ²]
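The table lookup for this example can be cross-checked numerically. The sketch below uses only the Python standard library: it computes the right-tail area with a standard recurrence in df and then finds the critical value by bisection. The function names `chi2_right_tail` and `chi2_critical` are hypothetical, and this is one possible approach rather than the method used on the slides, which read the value from the chi-square table.

```python
import math

def chi2_right_tail(x, df):
    """Area to the right of x under the chi-square curve with df degrees of freedom."""
    if x <= 0:
        return 1.0
    # Closed-form starting points: df = 1 (odd case) and df = 2 (even case).
    if df % 2 == 1:
        area, nu = math.erfc(math.sqrt(x / 2)), 1
    else:
        area, nu = math.exp(-x / 2), 2
    # Recurrence: Q(x; nu + 2) = Q(x; nu) + (x/2)^(nu/2) * e^(-x/2) / Gamma(nu/2 + 1)
    while nu < df:
        area += (x / 2) ** (nu / 2) * math.exp(-x / 2) / math.gamma(nu / 2 + 1)
        nu += 2
    return area

def chi2_critical(df, right_tail_area):
    """Find x such that the area to the right of x equals right_tail_area."""
    lo, hi = 0.0, 1.0
    while chi2_right_tail(hi, df) > right_tail_area:   # widen until x is bracketed
        hi *= 2
    for _ in range(100):                               # bisection on the bracket
        mid = (lo + hi) / 2
        if chi2_right_tail(mid, df) > right_tail_area:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(f"{chi2_critical(7, 0.10):.3f}")  # 12.017
```

For df = 7 and a right-tail area of .10 this gives χ² = 12.017, the value the chi-square table reports for that row and column.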

Example 2

Find the value of χ² for 12 degrees of freedom and an area of .05 in the left tail of the chi-square distribution curve.

Solution

Area in the right tail = 1 − Area in the left tail = 1 − .05 = .95

Table 11.2 χ² for df = 12 and .95 Area in the Right Tail
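The left-tail to right-tail conversion in this solution can be checked directly, because for even df the right-tail area has a closed form: e^(−x/2) times the sum of (x/2)^j / j! for j = 0 to df/2 − 1. The value 5.226 used below is the standard table entry for df = 12 and .95 area in the right tail; it is an assumption about what the table shows, since the table is not reproduced here. The function name is hypothetical.

```python
import math

def right_tail_area_even_df(x, df):
    """Right-tail area of the chi-square distribution; valid for even df only."""
    if df % 2 != 0:
        raise ValueError("this closed form only holds for even df")
    half = x / 2
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))

left_tail = 0.05
right_tail = 1 - left_tail          # .95, the area used to enter the table
chi2_value = 5.226                  # assumed table entry for df = 12, right tail .95

area_right = right_tail_area_even_df(chi2_value, 12)
print(f"right-tail area = {area_right:.3f}, left-tail area = {1 - area_right:.3f}")
```

The computed right-tail area agrees with .95 to three decimals, so the area left of χ² = 5.226 is the required .05.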

A GOODNESS-OF-FIT TEST

Definition
An experiment with the following characteristics is called a multinomial experiment.
1. It consists of n identical trials (repetitions).
2. Each trial results in one of k possible outcomes (or categories), where k > 2.
3. The trials are independent.
4. The probabilities of the various outcomes remain constant for each trial.

Definition
The frequencies obtained from the performance of an experiment are called the observed frequencies and are denoted by O. The expected frequencies, denoted by E, are the frequencies that we expect to obtain if the null hypothesis is true. The expected frequency for a category is obtained as

E = np

where n is the sample size and p is the probability that an element belongs to that category if the null hypothesis is true.

Degrees of Freedom for a Goodness-of-Fit Test
In a goodness-of-fit test, the degrees of freedom are

df = k − 1

where k denotes the number of possible outcomes (or categories) for the experiment.

Test Statistic for a Goodness-of-Fit Test
The test statistic for a goodness-of-fit test is χ², and its value is calculated as

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where
O = observed frequency for a category
E = expected frequency for a category = np

Remember that a chi-square goodness-of-fit test is always right-tailed.
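As a minimal sketch of these definitions, the code below computes E = np for each category, df = k − 1, and the usual goodness-of-fit statistic, the sum over categories of (O − E)²/E. The observed counts are made up purely for illustration and do not come from any example in these slides; the function names are hypothetical.

```python
def expected_frequencies(n, probabilities):
    """E = n * p for each category, with p taken from the null hypothesis."""
    return [n * p for p in probabilities]

def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Made-up data: four categories, each assumed equally likely under H0.
observed = [30, 20, 25, 25]
probabilities = [0.25, 0.25, 0.25, 0.25]

n = sum(observed)                                   # sample size
expected = expected_frequencies(n, probabilities)   # 25 in every category
df = len(observed) - 1                              # k - 1
statistic = chi_square_statistic(observed, expected)

print(f"n = {n}, df = {df}, chi-square = {statistic:.3f}")  # n = 100, df = 3, chi-square = 2.000
```

Because every term (O − E)²/E is nonnegative, large values of the statistic indicate a poor fit, which is why the test is right-tailed.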

Example
A bank has an ATM installed inside the bank, and it is available to its customers only from 7 AM to 6 PM Monday through Friday. The manager of the bank wanted to investigate if the percentage of transactions made on this ATM is the same for each of the 5 days (Monday through Friday) of the week. She randomly selected one week and counted the number of transactions made on this ATM on each of the 5 days during this week. The information she obtained is given in the following table, where the number of users represents the number of transactions on this ATM on these days. For convenience, we will refer to these transactions as people or users.

[Table: number of ATM users on each of the 5 days, Monday through Friday]

At the 1% level of significance, can we reject the null hypothesis that the number of people who use this ATM each of the 5 days of the week is the same? Assume that this week is typical of all weeks in regard to the use of this ATM.

Solution

Step 1:
H0: p1 = p2 = p3 = p4 = p5 = .20
H1: At least two of the five proportions are not equal to .20

Step 2:
- There are 5 categories (the 5 days on which the ATM is used).
- This is a multinomial experiment.
- We use the chi-square distribution to make this test.

Step 3:
Area in the right tail = α = .01
k = number of categories = 5
df = k − 1 = 5 − 1 = 4
The critical value of χ² = 13.277
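The critical value in Step 3 can be sanity-checked without a table. For df = 4 the area to the right of x has the closed form e^(−x/2)(1 + x/2), so plugging in 13.277 should return roughly .01. The closed form is a standard result, not something given on the slide, and the names below are hypothetical.

```python
import math

ALPHA = 0.01             # significance level = area in the right tail
K = 5                    # number of categories (days)
DF = K - 1               # degrees of freedom
CRITICAL_VALUE = 13.277  # from the chi-square table, as stated in Step 3

def right_tail_area_df4(x):
    """Right-tail area of the chi-square distribution with exactly 4 df."""
    return math.exp(-x / 2) * (1 + x / 2)

def falls_in_rejection_region(test_statistic):
    """The goodness-of-fit test is right-tailed: reject H0 above the critical value."""
    return test_statistic > CRITICAL_VALUE

print(f"df = {DF}, area right of {CRITICAL_VALUE} = {right_tail_area_df4(CRITICAL_VALUE):.4f}")
```

Any test statistic larger than 13.277 then falls in the rejection region at the 1% level.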

Calculating the Value of the Test Statistic

Step 4:
All the required calculations to find the value of the test statistic χ² are shown in Table 11.3.

Step 5:
- The value of the test statistic χ² = 23.184 is larger than the critical value of χ² = 13.277.
- It falls in the rejection region.
- Hence, we reject the null hypothesis.
- We state that the number of persons who use this ATM is not the same for the 5 days of the week.

Example 2
In a July 23, 2009, Harris Interactive Poll, 1015 advertisers were asked about their opinions of Twitter. The percentage distribution of their responses is shown in the following table.

[Table: percentage distribution of advertisers' opinions of Twitter, 2009]

Assume that these percentages hold true for the 2009 population of advertisers. Recently 800 randomly selected advertisers were asked the same question. The following table lists the number of advertisers in this sample who gave each response.

[Table: number of advertisers in the current sample who gave each response]

Test at the 2.5% level of significance whether the current distribution of opinions is different from that for 2009.

Step 1:
H0: The current percentage distribution of opinions is the same as for 2009.
H1: The current percentage distribution of opinions is different from that for 2009.

Step 2:
- There are 4 categories of opinion.
- This is a multinomial experiment.
- We use the chi-square distribution to make this test.

Step 3:
Area in the right tail = α = .025
k = number of categories = 4
df = k − 1 = 4 − 1 = 3
The critical value of χ² = 9.348

Calculating the Value of the Test Statistic

Step 4:
All the required calculations to find the value of the test statistic χ² are shown in Table 11.4.

Solution

Step 5:
- The value of the test statistic χ² = 5.420 is smaller than the critical value of χ² = 9.348.
- It falls in the nonrejection region.
- Hence, we fail to reject the null hypothesis.
- We state that the sample does not provide enough evidence that the current percentage distribution of opinions is different from that for 2009.
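The decision in Step 5 can also be expressed with a p-value. For df = 3 the right-tail area beyond x is erfc(√(x/2)) + √(2x/π)·e^(−x/2), a standard closed form that is not part of the slides. The sketch below takes the test statistic 5.420 and the critical value 9.348 from the text; the function and variable names are hypothetical.

```python
import math

def right_tail_area_df3(x):
    """Right-tail area of the chi-square distribution with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

alpha = 0.025            # significance level
test_statistic = 5.420   # value reported in Step 5
critical_value = 9.348   # value reported in Step 3

# Critical-value approach: right-tailed, so reject only above the critical value.
reject_by_critical_value = test_statistic > critical_value

# p-value approach: area to the right of the observed statistic.
p_value = right_tail_area_df3(test_statistic)
reject_by_p_value = p_value < alpha

print(f"area right of critical value = {right_tail_area_df3(critical_value):.3f}")
print(f"p-value = {p_value:.3f}")
print(f"reject H0: {reject_by_critical_value} (critical value), {reject_by_p_value} (p-value)")
```

Both approaches lead to the same decision: the p-value is well above .025, so the null hypothesis is not rejected.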