Computing for Research I Spring 2014 Primary Instructor: Elizabeth Garrett-Mayer R: EDA and writing commands March 31


Page 1: Computing for Research I Spring 2014

Computing for Research I
Spring 2014

Primary Instructor: Elizabeth Garrett-Mayer

R: EDA and writing commands
March 31

Page 2: Computing for Research I Spring 2014

Some more on graphics

• 3-D "ish" plots
  – image
  – contour

Page 3: Computing for Research I Spring 2014

contour and image plot
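The slide's figure did not survive extraction, so here is a minimal sketch of what image and contour do. The surface z is a made-up example (a smooth bump on a grid), not the data used in the lecture.

```r
# Hypothetical surface: a smooth bump evaluated on a 50 x 50 grid
x <- seq(-3, 3, length.out = 50)
y <- seq(-3, 3, length.out = 50)
z <- outer(x, y, function(a, b) exp(-(a^2 + b^2) / 2))

par(mfrow = c(1, 2))                  # two panels side by side
image(x, y, z, main = "image")        # color-coded grid of z values
contour(x, y, z, main = "contour")    # lines of equal z
```

Calling contour(x, y, z, add = TRUE) right after image() overlays the contour lines on the colored grid.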

Page 4: Computing for Research I Spring 2014

EDA (exploratory data analysis)

• Make figures!
• plot, boxplot, hist, etc.
• Point estimates and confidence intervals
  – t.test: in addition to the p-value, gives the mean and confidence interval.
  – binom.test: estimate + one-sample test + confidence interval. Exact calculations.
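As an illustration of pulling the estimate and interval out of these test objects, a small sketch follows. The data and counts are invented for the example.

```r
set.seed(1)                        # arbitrary seed, for reproducibility only
x <- rnorm(30, mean = 5, sd = 2)   # hypothetical continuous measurements

tt <- t.test(x)                    # one-sample t-test (null mean = 0 by default)
tt$estimate                        # sample mean
tt$conf.int                        # 95% confidence interval for the mean
tt$p.value

bt <- binom.test(7, 20)            # 7 successes in 20 trials (made-up counts)
bt$estimate                        # observed proportion, 7/20
bt$conf.int                        # exact 95% confidence interval
```

Both functions return a list, so names(tt) or str(bt) shows everything that can be extracted.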

Page 5: Computing for Research I Spring 2014

Binomial tests

– binom.test: one-sample test and confidence interval. Assumes the null is p=0.50 unless specified. Exact calculations.

– prop.test: tests that proportions are the same across groups. Chi-square-based.

– fisher.test: tests that proportions are the same across groups. Exact calculations.
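A short sketch comparing the three tests on one small table may help. The counts and group labels are hypothetical.

```r
# Hypothetical counts: rows are groups, columns are outcome (yes, no)
tab <- matrix(c(15,  5,
                 8, 12), nrow = 2, byrow = TRUE,
              dimnames = list(group = c("A", "B"), outcome = c("yes", "no")))

binom.test(15, 20, p = 0.5)$p.value   # one sample: is group A's proportion 0.5?
prop.test(tab)$p.value                # chi-square-based comparison of A vs B
fisher.test(tab)$p.value              # exact comparison of A vs B
prop.test(tab)$estimate               # the two observed proportions
```

With small cell counts the chi-square approximation in prop.test can be poor, which is one reason to check the result against fisher.test.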

Page 6: Computing for Research I Spring 2014

Other basic tests

• wilcox.test: Wilcoxon rank-sum test (2 groups) or signed-rank test (paired or one-sample).
• kruskal.test: Kruskal-Wallis rank-sum test (>= 2 groups)
• shapiro.test: Shapiro-Wilk normality test
• mantelhaen.test: Mantel-Haenszel test
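The calls below sketch how each of these tests is invoked. The three samples are simulated for illustration; UCBAdmissions is a 2 x 2 x 6 table that ships with R.

```r
set.seed(2)                # arbitrary seed
a <- rnorm(15, mean = 0)   # three hypothetical groups
b <- rnorm(15, mean = 1)
d <- rnorm(15, mean = 2)

p.ranksum <- wilcox.test(a, b)$p.value                 # two independent groups
p.signed  <- wilcox.test(a, b, paired = TRUE)$p.value  # a and b treated as pairs
p.kw      <- kruskal.test(list(a, b, d))$p.value       # three groups
p.sw      <- shapiro.test(a)$p.value                   # normality of one sample

# mantelhaen.test needs a 2 x 2 x K table (K strata)
p.mh <- mantelhaen.test(UCBAdmissions)$p.value
```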

Page 7: Computing for Research I Spring 2014

Creating commands in R

• General syntax

function.name <- function(x, y,… z=T, w=NULL) {
    # type in all the stuff you want the function to do
    .
    .
    # at the end, you usually return something
    return(z)
}

  – choose a meaningful function name!
  – in above,
    • x, y would be required.
    • z and w have defaults and so are NOT required arguments
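To make the required-versus-default distinction concrete, here is a small sketch. The function summ.stats and its arguments are hypothetical names chosen for the example.

```r
# Hypothetical helper: x is required; digits and na.rm have defaults
summ.stats <- function(x, digits = 2, na.rm = TRUE) {
  out <- c(mean = mean(x, na.rm = na.rm), sd = sd(x, na.rm = na.rm))
  return(round(out, digits))
}

v <- c(1, 2, 3, 4, NA)
summ.stats(v)                  # uses both defaults
summ.stats(v, digits = 4)      # overrides one default by name
summ.stats(v, na.rm = FALSE)   # the NA now propagates to the result
```

Calling summ.stats() with no arguments fails as soon as x is used, because x has no default.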

Page 8: Computing for Research I Spring 2014

When to create a function?

• When you have something you want to do more than once

• Can be a simple routine that you use regularly, e.g. power calculation for an odds ratio based on fixed prevalence of ‘disease’ and varying prevalence of exposure

• Can be a routine that you want to repeat over a large set of variables, yet is specific to a data analysis
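As a sketch of the last case, a routine repeated over several variables, the example below wraps a mean and confidence interval in a function and applies it to three columns. The function name meanci and the use of the built-in mtcars data are assumptions made for illustration.

```r
# Hypothetical routine: mean and t-based 95% CI for one numeric variable
meanci <- function(x) {
  tt <- t.test(x)
  c(mean = unname(tt$estimate), lower = tt$conf.int[1], upper = tt$conf.int[2])
}

vars <- c("mpg", "hp", "wt")          # variables to summarize
res <- sapply(mtcars[, vars], meanci) # one column of results per variable
round(res, 2)
```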

Page 9: Computing for Research I Spring 2014

Very simple example

• Trimmed mean

trimmean <- function(x) {
    y <- sort(x)            # order the values
    n <- length(y)
    y <- y[-c(1,n)]         # drop the smallest and the largest
    meany <- mean(y)
    return(meany)
}

Page 10: Computing for Research I Spring 2014

Try it out

z <- c(-20,1,2,4,7,9,50,100)
mean(z)
trimmean(z)

z <- (1:100)^2
mean(z)
trimmean(z)
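For the first vector the arithmetic can be checked by hand, and against the trim argument of the built-in mean. The sketch repeats the one-observation trimmean so that it runs on its own.

```r
trimmean <- function(x) {
  y <- sort(x)
  n <- length(y)
  y <- y[-c(1, n)]    # drop the smallest and the largest value
  return(mean(y))
}

z <- c(-20, 1, 2, 4, 7, 9, 50, 100)
mean(z)                  # 153/8 = 19.125
trimmean(z)              # (1+2+4+7+9+50)/6 = 73/6
mean(z, trim = 1/8)      # built-in: trims 1/8 of the 8 values from each end
```

The built-in trim argument is a fraction of the observations rather than a count, so it matches trimmean here only because 1/8 of 8 values is exactly one value per end.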

Page 11: Computing for Research I Spring 2014

What about trimming more?

trimmean <- function(x, ntrim=1) {
    y <- sort(x)
    n <- length(y)
    v1 <- 1:ntrim
    v2 <- (n-ntrim+1):n
    y <- y[-c(v1,v2)]
    meany <- mean(y)
    return(meany)
}

Page 12: Computing for Research I Spring 2014

Checking out what your function is doing

trimmean <- function(x, ntrim=1) {
    y <- sort(x)
    n <- length(y)
    v1 <- 1:ntrim
    v2 <- (n-ntrim+1):n
    print(c(v1,v2))
    y <- y[-c(v1,v2)]
    meany <- mean(y)
    return(meany)
}

Page 13: Computing for Research I Spring 2014

Try it again

z <- c(-20,1,2,4,7,9,50,100)
mean(z)
trimmean(z, ntrim=2)

z <- (1:100)^2
mean(z)
trimmean(z, ntrim=10)
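One edge case worth knowing about: in R, 1:0 evaluates to c(1, 0), so the version above with ntrim=0 would still drop the smallest value. The variant below is one possible guard, not part of the lecture; the name trimmean.safe is hypothetical.

```r
# Hypothetical guarded variant: checks ntrim and indexes what to KEEP
trimmean.safe <- function(x, ntrim = 1) {
  n <- length(x)
  stopifnot(ntrim >= 0, 2 * ntrim < n)      # must leave at least one value
  y <- sort(x)
  keep <- seq_len(n - 2 * ntrim) + ntrim    # positions ntrim+1, ..., n-ntrim
  return(mean(y[keep]))
}

z <- c(-20, 1, 2, 4, 7, 9, 50, 100)
trimmean.safe(z, ntrim = 0)   # same as mean(z)
trimmean.safe(z, ntrim = 2)   # mean of 2, 4, 7, 9
```

Using seq_len() avoids the 1:0 surprise because seq_len(0) is an empty vector.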

Page 14: Computing for Research I Spring 2014

common ones to have saved

logit <- function(p) {
    return(log(p/(1-p)))
}

unlogit <- function(x)
    return(exp(x)/(1+exp(x)))

oddsratio <- function(x,y) {
    tabi <- table(x,y)
    or <- (tabi[1,1]*tabi[2,2])/(tabi[1,2]*tabi[2,1])
    return(or)
}

**note: no need to have "{" and "}" if your function only has one line
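A quick check of these helpers, as a sketch: base R's qlogis and plogis compute the same logit and inverse logit, and the odds ratio can be verified by hand on a small made-up data set. The vectors x and y are hypothetical.

```r
logit <- function(p) log(p / (1 - p))
unlogit <- function(x) exp(x) / (1 + exp(x))
oddsratio <- function(x, y) {
  tabi <- table(x, y)
  (tabi[1, 1] * tabi[2, 2]) / (tabi[1, 2] * tabi[2, 1])
}

logit(0.25)             # log(1/3)
qlogis(0.25)            # built-in equivalent of logit
unlogit(logit(0.25))    # round trip back to 0.25
plogis(logit(0.25))     # built-in equivalent of unlogit

# Hypothetical binary exposure x and outcome y, 10 subjects per exposure group
x <- rep(c(0, 1), c(10, 10))
y <- c(rep(0, 7), rep(1, 3), rep(0, 2), rep(1, 8))
oddsratio(x, y)         # (7*8)/(3*2) = 56/6
```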

Page 15: Computing for Research I Spring 2014

Example: Ford Study

• Objectives. We conducted a community-based cancer clinical trials education intervention in South Carolina (SC), which has high rates of cancer disparities. However, African Americans are less likely than other groups to participate in clinical trials. Low participation rates appear to be an outcome of negative trial perceptions.

• Methods. We conducted the intervention at 10 sites in eight counties. The intervention consisted of a 30-minute cancer clinical trials educational presentation. It was a component of a larger 4-hour cancer education program. Pre- and post-intervention surveys were administered. The 7-item Fallowfield instrument was used to assess perceptions of cancer clinical trials. Fisher’s exact tests were used to compare the proportion of participants who changed their responses from pre-test to post-test.

Page 16: Computing for Research I Spring 2014

Fallowfield Instrument Items

1. Do you think that patients should be asked to take part in medical research?

2. Suppose that you were asked to take part in a research study comparing two treatments, both of which were suitable for your illness. Would you be prepared to take part in a study comparing different treatments?

3. Usually the only scientific way to compare one treatment with another is for the choice between the two to be made randomly, rather like tossing a coin. Would you be prepared to take part in a study where treatment was chosen at random?

4. If you answered “No” or “Do not know” to Question 3, we would now like to ask you a bit more about this. In a randomized study a choice would be made between two treatments, either of which would be suitable for you. Your doctor and experts in the field do not know for sure if one treatment is better than the other, or if they are both the same, that’s why they want to do the study. Would knowing that encourage you to take part?

5. In a random choice study, if the treatment you were receiving did not suit you for any reason you could leave the study. Your doctor would then give you whatever other treatment might be appropriate for you. Would that encourage you to take part?

6. Before you agreed to enter a random choice study the doctor would tell you all about the two treatments being compared, before you were allocated to one or the other. Would that encourage you to take part?

7. If you knew all the following things were taken into account, would you change your mind and agree to take part in the study?
   • Both treatments were completely suitable
   • You could leave the study if the treatment did not suit you
   • There is plenty of information before the random choice was made

Page 17: Computing for Research I Spring 2014

Example: Evaluating an Intervention to Improve Clinical Trial Perceptions among Racially Diverse Communities in South Carolina

[Figure: "Proportion Changing" (y-axis, 0.0 to 1.0) for Items 1 to 7 (x-axis), with two values per item labeled "Y to N/DK" and "N/DK to Y".]

Page 18: Computing for Research I Spring 2014

Create a function for estimating the proportions for all 7 items

• Goal for EACH item:
  – estimate proportion changing from N/DK to Y
  – estimate proportion changing from Y to N/DK
  – estimate confidence intervals for proportions
  – test that proportions are different
  – plot each proportion and confidence interval on a graph
  – show p-value on figure

Page 19: Computing for Research I Spring 2014

Example

table(prects[,1])
table(postcts[,1])
table(prects[,1], postcts[,1])

               Post CTS1
                0    1
Pre CTS1   0    a    c
           1    b    d

Page 20: Computing for Research I Spring 2014

Proportions of interest

• P(N/DK to Y|N/DK) = c/(a+c)
• P(Y to N/DK|Y) = b/(b+d)

               Post CTS1
                0    1
Pre CTS1   0    a    c
           1    b    d
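The two proportions can be worked through on a small invented data set. In the sketch below, pre and post are hypothetical responses for one item (0 = N/DK, 1 = Y) for 30 people; they are not the study data.

```r
# Hypothetical pre/post responses for one item: 0 = N/DK, 1 = Y
pre  <- c(rep(0, 10), rep(1, 20))
post <- c(rep(0, 4), rep(1, 6),     # of the 10 who said N/DK before: 4 stay, 6 change
          rep(0, 2), rep(1, 18))    # of the 20 who said Y before: 2 change, 18 stay
tab <- table(pre, post)
tab

a <- tab[1, 1]; c2 <- tab[1, 2]     # c2 stands in for "c" to avoid masking c()
b <- tab[2, 1]; d  <- tab[2, 2]

p.ndk.to.y <- c2 / (a + c2)         # 6/10 = 0.6
p.y.to.ndk <- b / (b + d)           # 2/20 = 0.1
```

Each proportion conditions on the pre-test answer, so the denominators are row totals of the table, not the overall sample size.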

Page 21: Computing for Research I Spring 2014

Example

• Assume you have a table of values, tabi
• What do we want to do with the table?

aa <- tabi[1,1]
bb <- tabi[2,1]
cc <- tabi[1,2]
dd <- tabi[2,2]

t1 <- binom.test(bb, bb+dd)   # y to n/dk
t2 <- binom.test(cc, aa+cc)   # n/dk to y
tab2 <- matrix( c(aa, dd, cc, bb), byrow=F, ncol=2)
p <- fisher.test(tab2)$p.value

Page 22: Computing for Research I Spring 2014

Create a vector of output, with labels

vectr <- c( t1$estimate, t1$conf.int[1], t1$conf.int[2],
            t2$estimate, t2$conf.int[1], t2$conf.int[2],
            p)

names(vectr) <- c("p1","Lci1", "Uci1", "p2", "Lci2", "Uci2", "p")

vectr <- round(vectr,4)

Page 23: Computing for Research I Spring 2014

Put it all together in a function

twobytwo <- function(tabi) {
    aa <- tabi[1,1]
    bb <- tabi[2,1]
    cc <- tabi[1,2]
    dd <- tabi[2,2]

    t1 <- binom.test(bb, bb+dd)
    t2 <- binom.test(cc, aa+cc)
    tab2 <- matrix( c(aa, dd, cc, bb), byrow=F, ncol=2)
    p <- fisher.test(tab2)$p.value

    vectr <- c(t1$estimate, t1$conf.int[1], t1$conf.int[2],
               t2$estimate, t2$conf.int[1], t2$conf.int[2],
               p)

    names(vectr) <- c("p1","Lci1", "Uci1","p2","Lci2","Uci2","p")
    vectr <- round(vectr,4)

    return(vectr)
}
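To see what the function returns, it can be run on a single made-up table. The sketch repeats the function so that it runs on its own; the counts in tabi are hypothetical (a=4, b=2, c=6, d=18 in the slide's notation).

```r
twobytwo <- function(tabi) {
  aa <- tabi[1, 1]; bb <- tabi[2, 1]; cc <- tabi[1, 2]; dd <- tabi[2, 2]
  t1 <- binom.test(bb, bb + dd)     # Y to N/DK
  t2 <- binom.test(cc, aa + cc)     # N/DK to Y
  tab2 <- matrix(c(aa, dd, cc, bb), byrow = FALSE, ncol = 2)
  p <- fisher.test(tab2)$p.value
  vectr <- c(t1$estimate, t1$conf.int[1], t1$conf.int[2],
             t2$estimate, t2$conf.int[1], t2$conf.int[2], p)
  names(vectr) <- c("p1", "Lci1", "Uci1", "p2", "Lci2", "Uci2", "p")
  return(round(vectr, 4))
}

# Hypothetical counts, filled column by column: rows = pre, columns = post
tabi <- matrix(c(4, 2, 6, 18), ncol = 2)
res <- twobytwo(tabi)
res            # p1 = 2/20 = 0.1 and p2 = 6/10 = 0.6, each with its exact CI
```

Note how tab2 is laid out inside the function: each row is a pre-test group and the columns are "stayed" and "changed", so the Fisher test compares the two changing proportions.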

Page 24: Computing for Research I Spring 2014

What about making the figure?

• All of the results needed are already generated in the function and stored in vectr.

• Just need to include where to put the results:
  – Step 1: set up a plotting area
  – Step 2: include points and lines commands within function

Page 25: Computing for Research I Spring 2014

Add options to function, code to function

twobytwo.figure <- function(tabi, i=1, coll=1, diff = 0.2, plt=F) {
    ...
    if(plt==T) {
        points(c(i-diff,i+diff), vectr[c(1,4)], pch=16, cex=1.5, col=coll)
        lines(rep(i-diff,2), vectr[c(2,3)], lty=1, lwd=2, col=coll)
        lines(rep(i+diff,2), vectr[c(5,6)], lty=2, lwd=2, col=coll)
        ptext <- ifelse(p<0.0001,"<0.0001",as.character(round(p,4)))
        text(i,-0.1, labels=ptext)
    }
    ...

What is diff?
What is coll?
Why plt=T or F?

Page 26: Computing for Research I Spring 2014

Set up plotting figure

# set up plot
par(mar=c(6.5,4,2,2))
plot(c(0.5,7.5), c(0,1), type="n", xaxt="n", xlab="",
     ylab="Proportion Changing")
abline(h=c(0,1))
abline(v=seq(0.5,7.5,1), lty=3)
labs <- paste("Item ",c(1,2,3,4,5,6,7))
mtext(labs, side=1, at=1:7, line=5)
axis(1, at=(sort(rep(1:7,2))+rep(c(-0.2,0.2),7)),
     labels=rep(c("Y to N/DK","N/DK to Y"),7) , las=2, cex.axis=0.8)

# add lines to figure
for(i in 1:7) {
    tabi <- table(prects[,i], postcts[,i])
    twobytwo.figure(tabi,i, plt=T, coll="darkgreen")
}
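The slides elide part of twobytwo.figure and do not include the survey data, so the following is only a sketch of how the pieces might fit together. The function name twobytwo.sketch, the simulated prects and postcts matrices, the simplified axis labels, and the xpd = NA setting are all assumptions. Reading the plotting code, diff appears to be the horizontal offset that separates the two estimates within an item, coll is the drawing color, and plt switches the drawing on or off so the same function can also be used just to get the numbers.

```r
# Hypothetical data: 60 respondents, 7 items, coded 0 = N/DK and 1 = Y
set.seed(2014)
n <- 60
prects  <- matrix(rbinom(n * 7, 1, 0.5), ncol = 7)
postcts <- matrix(rbinom(n * 7, 1, 0.8), ncol = 7)

# One possible way to fill in the elided parts (an assumption, not the original)
twobytwo.sketch <- function(tabi, i = 1, coll = 1, diff = 0.2, plt = FALSE) {
  aa <- tabi[1, 1]; bb <- tabi[2, 1]; cc <- tabi[1, 2]; dd <- tabi[2, 2]
  t1 <- binom.test(bb, bb + dd)                     # Y to N/DK
  t2 <- binom.test(cc, aa + cc)                     # N/DK to Y
  p  <- fisher.test(matrix(c(aa, dd, cc, bb), ncol = 2))$p.value
  vectr <- c(t1$estimate, t1$conf.int, t2$estimate, t2$conf.int, p)
  names(vectr) <- c("p1", "Lci1", "Uci1", "p2", "Lci2", "Uci2", "p")
  if (plt) {
    points(c(i - diff, i + diff), vectr[c(1, 4)], pch = 16, cex = 1.5, col = coll)
    lines(rep(i - diff, 2), vectr[c(2, 3)], lty = 1, lwd = 2, col = coll)
    lines(rep(i + diff, 2), vectr[c(5, 6)], lty = 2, lwd = 2, col = coll)
    ptext <- ifelse(p < 0.0001, "<0.0001", as.character(round(p, 4)))
    text(i, -0.1, labels = ptext, xpd = NA)  # xpd = NA (my addition) allows text in the margin
  }
  return(round(vectr, 4))
}

# Simplified plotting area, then one call per item
plot(c(0.5, 7.5), c(0, 1), type = "n", xaxt = "n", xlab = "",
     ylab = "Proportion Changing")
mtext(paste("Item", 1:7), side = 1, at = 1:7, line = 3)
results <- sapply(1:7, function(i) {
  # factor() with fixed levels keeps the table 2 x 2 even if a level is absent
  tabi <- table(factor(prects[, i], levels = 0:1),
                factor(postcts[, i], levels = 0:1))
  twobytwo.sketch(tabi, i, plt = TRUE, coll = "darkgreen")
})
```

Collecting the return values with sapply gives a 7 x 7 matrix (one column per item), which is convenient for a results table alongside the figure.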

Page 27: Computing for Research I Spring 2014

[Figure: resulting plot of "Proportion Changing" (y-axis, 0.0 to 1.0) for Items 1 to 7, with "Y to N/DK" and "N/DK to Y" shown for each item. P-values printed for Items 1 to 7: 6e-04, <0.0001, <0.0001, <0.0001, <0.0001, <0.0001, <0.0001.]