BIO5312 Biostatistics
R Session 11: Multisample Hypothesis Testing II
Dr. Junchao Xia
Center of Biophysics and Computational Biology
Fall 2016
11/8/2016 1 /15
Generating Box Plots for Different Groups
Loading data and building 3 different groups
# set the working directory
> setwd("C:/Users/Junchao/Desktop/Biostatistics_5312/2016/lab_11")
# read data from the data file
> lead = read.table("LEAD.DAT.txt", header=TRUE)
# flag individuals whose maxfwt is not the missing-value code 99
> ids = lead$maxfwt != 99
# get the maximum finger-wrist tapping score (maxfwt) for those individuals
> fwt = lead$maxfwt[ids]
# obtain a factor using the group IDs
> grp = factor(lead$lead_grp[ids])
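The filtering step above can be tried without the course data file. The sketch below uses a small made-up data frame (`lead_toy` and all of its values are hypothetical, not taken from LEAD.DAT.txt). It also shows one possible alternative: recoding 99 as NA so that R's usual missing-value handling applies.

```r
# Toy stand-in for the lead data; values are invented for illustration only.
lead_toy <- data.frame(
  maxfwt   = c(72, 99, 61, 49, 99, 58),
  lead_grp = c(1, 1, 2, 3, 2, 1)
)

# Same approach as on the slide: keep rows whose maxfwt is not the code 99.
keep    <- lead_toy$maxfwt != 99
fwt_toy <- lead_toy$maxfwt[keep]
grp_toy <- factor(lead_toy$lead_grp[keep])

# Alternative: recode 99 as NA instead of dropping the rows by hand.
lead_toy$maxfwt[lead_toy$maxfwt == 99] <- NA
n_missing <- sum(is.na(lead_toy$maxfwt))
```

Either way, the response and the grouping factor must be subset with the same index so that they stay aligned.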
Box Plots from Different Groups
# generate the boxplots for three different groups
>boxplot(fwt~grp,xlab="Lead group",ylab="fwt")
Fit a Multiple Linear Model
# fit a multiple linear model
>fit1=lm(fwt~grp)
>summary(fit1)
Call:
lm(formula = fwt ~ grp)
Residuals:
Min 1Q Median 3Q Max
-41.438 -5.750 0.000 7.531 31.500
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 54.438 1.539 35.370 < 2e-16 ***
grp2 -10.438 3.217 -3.245 0.00162 **
grp3 -2.938 3.441 -0.854 0.39548
Residual standard error: 12.31 on 96 degrees of freedom
Multiple R-squared: 0.09905, Adjusted R-squared: 0.08028
F-statistic: 5.277 on 2 and 96 DF, p-value: 0.006692
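With R's default treatment coding, the intercept is the mean of the first group and each grpK coefficient is the difference between group K's mean and the first group's mean. The sketch below checks this on simulated data (the names `grp_sim`, `y_sim` and the simulated values are assumptions for illustration, not the lead data).

```r
set.seed(1)

# Simulated three-group data with unequal group sizes (illustrative only).
grp_sim <- factor(rep(1:3, times = c(8, 5, 6)))
mu      <- c(54, 44, 51)
y_sim   <- rnorm(length(grp_sim), mean = mu[as.integer(grp_sim)], sd = 12)

fit_sim <- lm(y_sim ~ grp_sim)

# Rebuild the coefficients from the raw group means.
group_means <- tapply(y_sim, grp_sim, mean)
from_means  <- c(group_means[1],
                 group_means[2] - group_means[1],
                 group_means[3] - group_means[1])

coefs <- unname(coef(fit_sim))
print(cbind(coefs, from_means = unname(from_means)))
```

Read the same way, the output above gives fitted group means of 54.438, 54.438 - 10.438 = 44.000 and 54.438 - 2.938 = 51.500.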
Fit a Multiple Linear Model
# residual plots
>par(mfrow=c(2,2))
>plot(fit1)
One-Way ANOVA
# get the overall F test
>anova(fit1)
Analysis of Variance Table
Response: fwt
Df Sum Sq Mean Sq F value Pr(>F)
grp 2 1600.1 800.04 5.2773 0.006692 **
Residuals 96 14553.8 151.60
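The F value and p-value in the table follow directly from the sums of squares and degrees of freedom. As a small worked check, using only the rounded numbers printed above (so the last digits may differ slightly):

```r
# Values copied from the ANOVA table above.
ss_grp <- 1600.1;  df_grp <- 2
ss_res <- 14553.8; df_res <- 96

ms_grp <- ss_grp / df_grp          # mean square between groups
ms_res <- ss_res / df_res          # mean square within groups (residual)
f_val  <- ms_grp / ms_res          # overall F statistic
p_val  <- pf(f_val, df_grp, df_res, lower.tail = FALSE)

# R-squared is the share of the total sum of squares explained by grp.
r_squared <- ss_grp / (ss_grp + ss_res)

print(c(F = f_val, p = p_val, R2 = r_squared))
```

These agree with the F-statistic, p-value and Multiple R-squared reported by summary(fit1).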
# calculate the variance-covariance matrix for a fitted model
>vcov(fit1)
(Intercept) grp2 grp3
(Intercept) 2.368774 -2.368774 -2.368774
grp2 -2.368774 10.347804 2.368774
grp3 -2.368774 2.368774 11.843872
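The variance-covariance matrix is what the standard errors come from: the Std. Error column of summary(fit1) is the square root of its diagonal, and the off-diagonal terms are needed for contrasts between coefficients. The sketch below re-enters the printed matrix by hand (the object `V` is an assumption standing in for vcov(fit1)) and uses it to test group 2 against group 3.

```r
# Matrix copied from the vcov(fit1) output above.
nm <- c("(Intercept)", "grp2", "grp3")
V <- matrix(c( 2.368774, -2.368774, -2.368774,
              -2.368774, 10.347804,  2.368774,
              -2.368774,  2.368774, 11.843872),
            nrow = 3, byrow = TRUE, dimnames = list(nm, nm))

# Standard errors of the coefficients.
se <- sqrt(diag(V))

# Contrast grp2 - grp3: Var(a - b) = Var(a) + Var(b) - 2 Cov(a, b).
var_diff <- V["grp2", "grp2"] + V["grp3", "grp3"] - 2 * V["grp2", "grp3"]
se_diff  <- sqrt(var_diff)
est_diff <- -10.438 - (-2.938)     # coefficient estimates from summary(fit1)
t_diff   <- est_diff / se_diff
p_diff   <- 2 * pt(abs(t_diff), df = 96, lower.tail = FALSE)

print(round(se, 3))
print(c(t = t_diff, p = p_diff))
```

Because the coefficients are rounded in the printed output, the resulting p-value is only approximate.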
Least Significant Difference (LSD) Procedure
# LSD procedure from R session 07
> xbar <- tapply(fwt, grp, mean, na.rm = TRUE) # group mean
> s <- tapply(fwt, grp, sd, na.rm = TRUE) # group s.d
> n <- tapply(!is.na(fwt), grp, sum) # group sample size
> degf <- n - 1 # d.f. of groups
> total.degf <- sum(degf) # total d.f.
> ## the pooled standard deviation (square root of the pooled variance)
> pooled.sd <- sqrt(sum(s^2 * degf)/total.degf)
> # for pair i and j
> i=1; j=2
> dif <- xbar[i] - xbar[j]
> se.dif <- pooled.sd * sqrt(1/n[i] + 1/n[j])
> t.val <- dif/se.dif # test statistic
> t.val
3.244684
> 2 * pt(abs(t.val), total.degf, lower.tail=F)
0.001618783
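The steps above can be wrapped into a small helper so that any pair (i, j) can be tested. This is a minimal sketch: `lsd_pair` is a hypothetical name, the data are simulated rather than the lead data, and missing values are assumed to have been removed beforehand. The result is compared with pairwise.t.test without adjustment, which uses the same pooled SD.

```r
# Hypothetical helper: unadjusted LSD t test for groups i and j.
lsd_pair <- function(y, g, i, j) {
  g    <- factor(g)
  xbar <- tapply(y, g, mean)                 # group means
  s    <- tapply(y, g, sd)                   # group standard deviations
  n    <- tapply(y, g, length)               # group sample sizes
  total_degf <- sum(n - 1)                   # pooled degrees of freedom
  pooled_sd  <- sqrt(sum(s^2 * (n - 1)) / total_degf)
  t_val <- (xbar[i] - xbar[j]) / (pooled_sd * sqrt(1 / n[i] + 1 / n[j]))
  p_val <- 2 * pt(abs(t_val), total_degf, lower.tail = FALSE)
  c(t = unname(t_val), p = unname(p_val))
}

# Simulated data (illustrative only).
set.seed(11)
g_sim <- rep(1:3, times = c(20, 12, 10))
y_sim <- rnorm(length(g_sim), mean = c(54, 44, 51)[g_sim], sd = 12)

res_12 <- lsd_pair(y_sim, g_sim, 1, 2)
ref    <- pairwise.t.test(y_sim, g_sim, p.adjust.method = "none")$p.value

print(res_12)
print(ref)
```

The entry in row "2", column "1" of `ref` should match the p-value returned by the helper.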
Multiple Comparisons of Groups
> help(pairwise.t.test)
Pairwise t tests
Description
Calculate pairwise comparisons between group levels with corrections for multiple testing
Usage
pairwise.t.test(x, g, p.adjust.method = p.adjust.methods,
                pool.sd = !paired, paired = FALSE,
                alternative = c("two.sided", "less", "greater"), ...)
Arguments
x response vector.
g grouping vector or factor.
p.adjust.method Method for adjusting p values (see p.adjust).
pool.sd switch to allow/disallow the use of a pooled SD
paired a logical indicating whether you want paired t-tests.
alternative a character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater" or "less". Can be abbreviated.
... additional arguments to pass to t.test.
Fisher LSD Test
>pairwise.t.test(fwt,grp,p.adjust.method = "none")
Pairwise comparisons using t tests with pooled SD
data: fwt and grp
1 2
2 0.0016 -
3 0.3955 0.0758
P value adjustment method: none
P-Value Adjustment for Multiple Comparisons
>help("p.adjust")
Adjust P-values for Multiple Comparisons
Description
Given a set of p-values, returns p-values adjusted using one of several methods.
Usage
p.adjust(p, method = p.adjust.methods, n = length(p))
p.adjust.methods
# c("holm", "hochberg", "hommel", "bonferroni", "BH", "BY",
#   "fdr", "none")
Arguments
p numeric vector of p-values (possibly with NAs). Any other R object is coerced by as.numeric.
method correction method. Can be abbreviated.
n number of comparisons, must be at least length(p); only set this (to
non-default) when you know what you are doing!
P-Value Adjustment for Multiple Comparisons
# Bonferroni approach
>pairwise.t.test(fwt,grp,p.adjust.method = "bonferroni")
Pairwise comparisons using t tests with pooled SD
data: fwt and grp
1 2
2 0.0049 -
3 1.0000 0.2273
P value adjustment method: bonferroni
# False discovery rate
>pairwise.t.test(fwt,grp,p.adjust.method = "fdr")
Pairwise comparisons using t tests with pooled SD
data: fwt and grp
1 2
2 0.0049 -
3 0.3955 0.1137
P value adjustment method: fdr
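The adjusted tables can be approximately reproduced by passing the unadjusted p-values to p.adjust directly. The sketch below uses the p-value for groups 1 vs 2 from the LSD calculation and the two rounded values from the unadjusted pairwise table, so the last digit may differ from the output above. The vector name `p_raw` and its labels are assumptions for illustration.

```r
# Unadjusted p-values taken from the earlier output (two of them rounded).
p_raw <- c("1 vs 2" = 0.001618783, "1 vs 3" = 0.3955, "2 vs 3" = 0.0758)

# Bonferroni: multiply each p-value by the number of comparisons, cap at 1.
p_bonf <- p.adjust(p_raw, method = "bonferroni")

# "fdr" is an alias for the Benjamini-Hochberg ("BH") method.
p_fdr <- p.adjust(p_raw, method = "fdr")
p_bh  <- p.adjust(p_raw, method = "BH")

print(round(rbind(raw = p_raw, bonferroni = p_bonf, fdr = p_fdr), 4))
```

With three comparisons, Bonferroni multiplies every p-value by 3, while the BH procedure multiplies the smallest by 3, the middle one by 3/2 and leaves the largest unchanged.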
Two-Way ANOVA: No Interaction Effect
# add sex as a second categorical factor
>sex = factor(lead$sex[ids])
>fit2 = lm(fwt~grp+sex)
>summary(fit2)
Call:
lm(formula = fwt ~ grp + sex)
Residuals:
Min 1Q Median 3Q Max
-40.926 -5.771 -0.139 7.229 32.031
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 53.926 1.881 28.669 < 2e-16 ***
grp2 -10.309 3.241 -3.181 0.00198 **
grp3 -2.956 3.456 -0.856 0.39440
sex2 1.213 2.542 0.477 0.63435
Residual standard error: 12.36 on 95 degrees of freedom
Multiple R-squared: 0.1012, Adjusted R-squared: 0.07282
F-statistic: 3.566 on 3 and 95 DF, p-value: 0.01702
Two-Way ANOVA: No Interaction Effect
# print out two-way ANOVA
>anova(fit2)
Analysis of Variance Table
Response: fwt
Df Sum Sq Mean Sq F value Pr(>F)
grp 2 1600.1 800.04 5.2348 0.006971 **
sex 1 34.8 34.79 0.2277 0.634354
Residuals 95 14519.0 152.83
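anova() on a single lm fit reports sequential sums of squares: each term is adjusted only for the terms listed before it. That is why the grp row keeps the same Sum Sq (1600.1) as in the one-way table. In an unbalanced design the order of the terms therefore matters. The sketch below illustrates this on simulated data (the factors `a`, `b`, the cell counts and the response are all invented for illustration).

```r
set.seed(2016)

# Unbalanced two-factor layout with non-proportional cell counts.
a <- factor(rep(1:3, times = c(30, 15, 15)))
b <- factor(c(rep(1:2, times = c(22, 8)),
              rep(1:2, times = c(5, 10)),
              rep(1:2, times = c(6, 9))))
y <- 50 - 8 * (a == "2") + 3 * (b == "2") + rnorm(length(a), sd = 10)

# Same additive model, terms entered in the two possible orders.
tab_ab <- anova(lm(y ~ a + b))
tab_ba <- anova(lm(y ~ b + a))

print(tab_ab)
print(tab_ba)
```

The residual row is identical in both tables, but the Sum Sq attributed to each factor depends on whether it is entered first or second.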
Two-Way ANOVA: Interaction Effect
>fit3 = lm(fwt~grp*sex)
>summary(fit3)
Call:
lm(formula = fwt ~ grp * sex)
Residuals:
Min 1Q Median 3Q Max
-41.270 -6.207 0.333 7.436 29.730
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 54.2703 2.0095 27.006 < 2e-16 ***
grp2 -13.8087 3.9410 -3.504 0.000707 ***
grp3 -0.1592 4.5431 -0.035 0.972128
sex2 0.3964 3.0939 0.128 0.898329
grp2:sex2 10.8087 6.7800 1.594 0.114281
grp3:sex2 -6.3647 6.8934 -0.923 0.358242
Residual standard error: 12.22 on 93 degrees of freedom
Multiple R-squared: 0.1398, Adjusted R-squared: 0.09355
F-statistic: 3.023 on 5 and 93 DF, p-value: 0.01424
Two-Way ANOVA: Interaction Effect
# print out two-way ANOVA
>anova(fit3)
Analysis of Variance Table
Response: fwt
Df Sum Sq Mean Sq F value Pr(>F)
grp 2 1600.1 800.04 5.3545 0.006295 **
sex 1 34.8 34.79 0.2329 0.630535
grp:sex 2 623.3 311.67 2.0860 0.129960
Residuals 93 13895.6 149.42
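The grp:sex row is an F test comparing the additive model with the interaction model: the drop in residual sum of squares per extra parameter, divided by the residual mean square of the larger model. As a worked check using only the rounded residual rows of the two ANOVA tables (so the last digits may differ slightly):

```r
# Residual sums of squares and degrees of freedom from the two tables.
rss_add <- 14519.0; df_add <- 95   # fwt ~ grp + sex
rss_int <- 13895.6; df_int <- 93   # fwt ~ grp * sex

df_extra <- df_add - df_int        # parameters added by the interaction
f_int <- ((rss_add - rss_int) / df_extra) / (rss_int / df_int)
p_int <- pf(f_int, df_extra, df_int, lower.tail = FALSE)

print(c(F = f_int, p = p_int))
```

With the fitted objects available, anova(fit2, fit3) performs the same nested-model comparison.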
The End