
Mann-Whitney Test


Page 1: Mann-Whitney  Test

Mann-Whitney Test

• If X1, X2, … Xm is a sample of size m from a population, and Y1, Y2, … Yn is a sample of size n from another population, then the Mann-Whitney statistic, U, is the number of (Xi,Yj) pairs for which Xi<Yj. U is used to test the same null hypothesis of equal population distributions and p-values are given in Table A4.

• Go over Example 2.6.1 - use R

• First note that W (sum of ranks, the Wilcoxon statistic) and U (the Mann-Whitney statistic) are equivalent:

– Let W2 = sum of the ranks of the Y's in the combined sample, and note that for any Yj (assuming no ties), R(Yj) = (# of X's <= Yj) + (# of Y's <= Yj)

– So W2 = sum over all j's {R(Yj)}
= sum over all j's {(# of X's <= Yj) + (# of Y's <= Yj)}
= U + (1+2+3+…+n), since we may assume the Y's are ordered
= U + n(n+1)/2

• NOTE: In R, the W given in wilcox.test is = U (or m*n - U), depending on the order of the arguments: wilcox.test(x, y) counts the pairs with Xi > Yj, which is m*n - U, while wilcox.test(y, x) gives U

• If there are ties, then U = (# of pairs with Xi < Yj) + 1/2*(# of pairs with Xi = Yj)
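The relationship between U and W2 can be checked numerically. The following is a minimal sketch in Python (not the R code used in class); the data values and the helper names mann_whitney_u and midranks are made up for illustration and are not taken from Example 2.6.1. Tied values are assumed to receive the average of the ranks they occupy.

```python
from itertools import product

def mann_whitney_u(x, y):
    """Count (x_i, y_j) pairs with x_i < y_j; each tie x_i == y_j adds 1/2."""
    return sum(1.0 if xi < yj else 0.5 if xi == yj else 0.0
               for xi, yj in product(x, y))

def midranks(values):
    """Rank values 1..N, giving tied values the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

# Hypothetical data (one tie between the samples at 5.0)
x = [3.1, 4.7, 5.0, 6.2]               # m = 4
y = [4.0, 5.0, 7.3, 8.1, 9.4]          # n = 5
m, n = len(x), len(y)

u = mann_whitney_u(x, y)               # Mann-Whitney U
ranks = midranks(x + y)                # ranks in the combined sample
w2 = sum(ranks[m:])                    # rank sum of the Y's
w_x_first = sum(ranks[:m]) - m * (m + 1) / 2   # rank sum of the X's minus its minimum

print("U =", u)
print("W2 =", w2, " U + n(n+1)/2 =", u + n * (n + 1) / 2)
print("m*n - U =", m * n - u, " X rank sum minus m(m+1)/2 =", w_x_first)
```

For these made-up values the pair count gives U = 15.5 and the rank sum of the Y's is 30.5 = 15.5 + 15, so the identity also holds with the half-count for ties. The last line shows that the X rank sum minus m(m+1)/2 equals m*n - U.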

Page 2: Mann-Whitney  Test

• Review the relationship between a confidence interval and hypothesis testing in the parametric case…

• If the null hypothesis of F1=F2 is rejected then one possible alternative is the so-called shift alternative: F1(x) = F2(x-Δ), where Δ can be thought of as the difference in means (or medians)… do a sketch.

• A consequence of the truth of the shift alternative is that X and Y+Δ have the same distribution. So think of Δ as the difference of the medians - we'll use the Hodges-Lehmann estimate of Δ: H-L = median(all pwds), where the pwds are the pairwise differences Xi-Yj.

• The 95% confidence interval for Δ is found by:

– get all the pwds of X-Y

– arrange them from smallest to largest

– find 2 numbers ka and kb s.t. P(ka <= U < kb) = P(pwd(ka) < Δ <= pwd(kb)) = as close to the level of confidence as possible, where pwd(k) is the k-th smallest pwd

– for large samples, we'll use the normal approximation (more on this later)

• Now go over Example 2.6.2 on page 46ff - use the R code in R4.doc…

• Redo problem #4 on page 73 - get a 95% CI for Δ and its H-L estimate. Use R.
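The Hodges-Lehmann estimate and the confidence interval steps above can be sketched as follows. This is an illustrative Python version, not the R code in R4.doc, and the data are hypothetical rather than those of Example 2.6.2 or problem #4. Two assumptions are made here: the exact null distribution of U is obtained by enumerating every possible set of ranks for the Y's (practical only for small samples without ties), and ka, kb are chosen symmetrically (kb = m*n + 1 - ka) as the narrowest such pair whose coverage is at least the requested level.

```python
from itertools import combinations
from math import comb
from statistics import median

def u_null_pmf(m, n):
    """Exact null pmf of U (no ties): every set of n ranks for the Y's is equally likely."""
    counts = [0] * (m * n + 1)
    for y_ranks in combinations(range(1, m + n + 1), n):
        counts[sum(y_ranks) - n * (n + 1) // 2] += 1    # U = W2 - n(n+1)/2
    total = comb(m + n, n)
    return [c / total for c in counts]

def shift_interval(x, y, level=0.95):
    """Return (H-L estimate, lower, upper, exact coverage) for the shift of X relative to Y."""
    m, n = len(x), len(y)
    pwds = sorted(xi - yj for xi in x for yj in y)      # ordered pairwise differences
    hl = median(pwds)                                    # Hodges-Lehmann estimate
    pmf = u_null_pmf(m, n)
    ka = 0
    # Increase ka while P(ka+1 <= U < m*n - ka) still meets the level.
    while ka + 1 <= (m * n) // 2 and sum(pmf[ka + 1 : m * n - ka]) >= level:
        ka += 1
    if ka == 0:
        raise ValueError("samples too small for the requested confidence level")
    kb = m * n + 1 - ka
    coverage = sum(pmf[ka:kb])                           # P(ka <= U < kb)
    return hl, pwds[ka - 1], pwds[kb - 1], coverage      # pwd(ka), pwd(kb) are 1-based

# Hypothetical data with m = 4 and n = 5
x = [12, 15, 19, 23]
y = [10, 11, 14, 16, 18]

hl, lower, upper, coverage = shift_interval(x, y)
print("H-L estimate:", hl)
print("interval:", lower, "to", upper, " exact coverage:", round(coverage, 4))
```

With these made-up numbers the estimate is 3.5, and the interval runs from -4 to 12 (ka = 2, kb = 19) with exact coverage 122/126, about 0.968. Because U is discrete, 95% cannot be hit exactly. For large samples the enumeration becomes infeasible and the normal approximation would replace it.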