The R Project for Statistical Computing

The R Project for Statistical Computingpeople.oregonstate.edu/~knausb/Presentations/R_intro.pdf · Statistical Computing ... pdf() dev.cur() dev.off() # Turn off the device when done


What is R?

“R is a free software environment for statistical computing and graphics.”

(http://www.r-project.org/)

● Software environment
● Statistical computing
● Graphics


What is R? Software environment

 - An interpreted language.
 - Provides control structures (e.g., loops).
 - Interfaces with other languages (e.g., C, Fortran).


What is R? Statistical computing

 - Univariate methods: ANOVA, linear regression, etc.
 - Multivariate methods: PCA, clustering, multiple regression.
 - Bayesian tools (MCMC).
 - Distributions: rnorm(), rbinom(), etc.
 - Simulations: coupling distributions with control loops.
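
A minimal sketch of a simulation that couples a distribution function with a control loop. The sample size, mean, standard deviation and number of repetitions are arbitrary values chosen for illustration; they do not come from the slides.

```r
set.seed(42)                      # fix the random seed so the run is repeatable
n.sims <- 1000                    # number of simulated samples (arbitrary)
sample.means <- numeric(n.sims)   # pre-allocate a vector for the results

for (i in 1:n.sims) {
  draws <- rnorm(n = 20, mean = 10, sd = 2)  # 20 draws from a normal distribution
  sample.means[i] <- mean(draws)             # store the mean of this sample
}

mean(sample.means)  # should be close to 10
sd(sample.means)    # should be close to 2 / sqrt(20)
```

The loop builds up the sampling distribution of the mean one simulated sample at a time.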


What is R? Graphics

 - Scatterplots.
 - Histograms.
 - Density functions.
 - Box and whisker plots.
 - Maps.
 - 3D surfaces.
 - 3D scatterplots.
 - Heatmaps.
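
A few of these plot types can be sketched with base R graphics alone. The example assumes the built-in iris data set; the choice of columns and titles is illustrative only.

```r
# Histogram (the returned object holds the bin counts).
h <- hist(iris$Sepal.Length, main = "Histogram", xlab = "Sepal length (cm)")

# Scatterplot of two numeric columns.
plot(iris$Sepal.Length, iris$Petal.Length, main = "Scatterplot",
     xlab = "Sepal length (cm)", ylab = "Petal length (cm)")

# Density function estimated from the data.
plot(density(iris$Sepal.Length), main = "Density")

# Box and whisker plot, one box per species.
boxplot(Sepal.Length ~ Species, data = iris, main = "Box and whisker plot")
```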


Go to R and the web.


R Background

R is an open source, free version (dialect) of S.

S-plus (www.insightful.com) is a commercial version of S with a GUI (Graphical User Interface).

R “is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues.”

GNU stands for “GNU's Not Unix.” The GNU Project was launched in 1984 to develop a complete Unix-like operating system which is free software.


What is R?

R is an interpreted language (i.e., commands are evaluated on the fly rather than compiled ahead of time).

Because it is interpreted, it is relatively user friendly:
 - commands can be sent one line at a time
 - commands can be sent as part of a line

Editors that can send commands to R:
 - Windows GUI text editor.
 - ESS – Emacs Speaks Statistics (UNIX).

This allows for 'trial and error' debugging of code.


Rcmdr – R Commander, a GUI for R!


R Syntax

> my.data <- read.table(file="data.txt",
+                       header=TRUE, sep="\t")

read.table() is the R function. The parameters for the function are included in parentheses.

Try:
> help(read.table)


R Syntax

> my.data <- read.table(file="data.txt",
+                       header=TRUE, sep="\t")

<- is the assignment operator (also see -> and =).

my.data is the 'object' in which to store your data.
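
A self-contained sketch of the same call. Because data.txt is not available here, the example first writes a small tab-delimited file to a temporary location; the column names and values are made up for illustration.

```r
# Write a tiny tab-delimited file (hypothetical data) to a temporary location.
data.file <- tempfile(fileext = ".txt")
writeLines(c("site\tcount", "A\t12", "B\t7", "C\t30"), con = data.file)

# Read it back: the first line is a header and fields are separated by tabs.
my.data <- read.table(file = data.file, header = TRUE, sep = "\t")

my.data        # print the object
str(my.data)   # show its structure: one row per line, one column per field
```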


Working directories and slashes

R operates within a 'working directory.'
Try getwd(), setwd()

Unix uses forward slashes '/', therefore R uses forward slashes.

Windows uses backslashes '\', which must be changed to forward slashes (or doubled, '\\') in paths given to R.
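
A short sketch of working with the working directory and with Windows-style paths. The path C:/Users/me/data.txt is hypothetical and is only manipulated as text, never opened.

```r
old.wd <- getwd()      # remember the current working directory
setwd(tempdir())       # move to a directory that is known to exist
getwd()                # confirm the change
setwd(old.wd)          # return to where we started

# Two ways to write a Windows path in R (hypothetical path):
win.path  <- "C:/Users/me/data.txt"       # forward slashes
win.path2 <- "C:\\Users\\me\\data.txt"    # escaped (doubled) backslashes

# Convert backslashes to forward slashes.
gsub("\\", "/", win.path2, fixed = TRUE)

# file.path() joins path components with forward slashes.
file.path("C:", "Users", "me", "data.txt")
```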


R Data Structures

vector – c(2, 9, 1000, 546, 1, 1, 45)

matrix – e.g., 8 x 7; all elements share one type (typically numbers)

data.frame – A table whose columns can hold different types, including non-numbers (e.g., factors)

list – A vector whose elements can be any objects, including other vectors.
 - Many R functions return a list.
 - Packages that define their own data structures usually define special instances of lists.
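
One way to build each of these structures, as a sketch. Apart from the vector, all values and names (id, group, value, fit) are invented for illustration.

```r
my.vector <- c(2, 9, 1000, 546, 1, 1, 45)

# An 8 x 7 matrix filled column by column with the numbers 1 to 56.
my.matrix <- matrix(1:56, nrow = 8, ncol = 7)

# A data.frame mixing integer, factor and numeric columns.
my.data.frame <- data.frame(id    = 1:3,
                            group = factor(c("a", "b", "a")),
                            value = c(1.5, 2.0, 3.25))

# A list can hold objects of different kinds and sizes.
my.list <- list(numbers = my.vector, table = my.data.frame)

# Example of a function that returns a list: lm() returns a fitted model
# object that is a list with a class attribute.
fit <- lm(value ~ id, data = my.data.frame)
is.list(fit)
names(fit)
```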


Indexing R Data Structures

my.vector[3]
    Indexes the third element in the vector.

my.data.frame[3,5]
    Indexes the element in the third row and the fifth column.

my.list[[4]][5,8]
    Indexes the fourth element in the list (here it's a data.frame), then indexes the element in the fifth row and the eighth column.
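
The three indexing forms can be tried on small made-up objects. The contents below are assumptions chosen so that each result is easy to check by hand.

```r
my.vector <- c(2, 9, 1000, 546, 1, 1, 45)
my.vector[3]          # third element: 1000

# A 5 x 8 data.frame built from the numbers 1 to 40, filled column by column,
# so column 5 holds 21..25 and column 8 holds 36..40.
my.data.frame <- as.data.frame(matrix(1:40, nrow = 5, ncol = 8))
my.data.frame[3, 5]   # row 3, column 5: 23

# A list whose fourth element is the data.frame.
my.list <- list("a", 1:3, TRUE, my.data.frame)
my.list[[4]][5, 8]    # row 5, column 8 of the fourth element: 40
```

Double brackets extract the list element itself; the single brackets that follow then index into it.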


Functions that create vectors

> 3:8 #returns (3, 4, 5, 6, 7, 8)

> rep(3, times = 4) #returns (3, 3, 3, 3)

> seq(from = 1, to = 2, by = 0.2) #returns (1, 1.2, 1.4, 1.6, 1.8, 2.0)

Useful for indexing:

> my.data.frame[3:5, 18:22]

Indexes elements in rows 3-5 and in columns 18-22.
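
A runnable sketch of the same idea. The built-in iris data set stands in for my.data.frame; since iris has only 5 columns, the column range here is 2:4 rather than 18:22.

```r
3:8                                # 3 4 5 6 7 8
rep(3, times = 4)                  # 3 3 3 3
s <- seq(from = 1, to = 2, by = 0.2)
s

# Use the generated vectors as row and column indexes.
sub <- iris[3:5, 2:4]              # rows 3-5, columns 2-4
sub

# Any integer vector works as an index, not only a contiguous range.
iris[c(1, 50, 150), ]
```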


R Help System

R help is largely command based.
If you don't know the command. . .

help(command) – man page style documentation.
 - Searches loaded packages only.

?command is a synonym for help(command)

The 'See Also' section of a help page lists related commands.

Package vignettes are process oriented – not always present.

help.start()
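
A sketch of these help calls, plus two standard ways to search when the command name is unknown: apropos() and help.search(). Neither appears on the slide; they are suggested here as an assumption about what the slide's ellipsis leads to. The calls that open a pager or browser are left commented out.

```r
h <- help("read.table")        # same topic as ?read.table; printing h shows the page
length(h) > 0                  # TRUE when a help page was found

hits <- apropos("^read\\.")    # names of visible objects that start with "read."
head(hits)

# Interactive sessions only (these open a pager or a browser):
# help.search("regression")    # keyword search across installed documentation
# vignette()                   # list the vignettes of installed packages
# help.start()                 # HTML help in a browser
```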


Go to iris example


R Graphics output

In Windows you can copy and paste .emf & .bmp.

win.metafile() # Windows specific.

?Devices

jpeg() and png()
postscript()
pdf()

dev.cur()

dev.off() # Turn off the device when done

Cairo – R graphics device which includes .tif
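
A minimal sketch of the open, plot, close pattern with the pdf() device. The file name iris_scatter.pdf and the figure size are arbitrary choices; the plot uses the built-in iris data set.

```r
out.file <- file.path(tempdir(), "iris_scatter.pdf")  # hypothetical file name

pdf(file = out.file, width = 6, height = 4)  # open a PDF device (size in inches)
dev.cur()                                    # report which device is active

plot(iris$Sepal.Length, iris$Petal.Length,
     xlab = "Sepal length (cm)", ylab = "Petal length (cm)")

dev.off()                                    # turn off the device when done

file.exists(out.file)                        # the file is complete only after dev.off()
```

The same pattern applies to png(), jpeg() and postscript(); only the device function and its size arguments change.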


R Performance

R is an interpreted language (i.e., commands are evaluated on the fly rather than compiled ahead of time).

This makes R inherently slow compared to compiled languages (such as C, C++, and Fortran).

Loops (e.g., 'for') can be computationally intensive.

The 'apply' family of functions, sapply(), lapply(), apply(), mapply(), tapply(), and rapply(), execute a function repetitively.

Utilize compiled code: ?.C & ?.Fortran
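
A sketch comparing a 'for' loop with members of the 'apply' family and with a vectorized expression. The squaring task is an arbitrary example; the later lines use the built-in iris data set. No timings are claimed here.

```r
x <- 1:10

# Explicit loop: fill a pre-allocated vector one element at a time.
squares.loop <- numeric(length(x))
for (i in seq_along(x)) {
  squares.loop[i] <- x[i]^2
}

# sapply(): apply a function to each element and simplify to a vector.
squares.sapply <- sapply(x, function(v) v^2)

# Vectorized arithmetic: the whole vector in one call.
squares.vec <- x^2

# apply(): a function over the columns (MARGIN = 2) of a matrix.
col.means <- apply(as.matrix(iris[, 1:4]), 2, mean)

# tapply(): a function over groups defined by a factor.
group.means <- tapply(iris$Sepal.Length, iris$Species, mean)
group.means
```

All three versions of the squares give the same result; where a vectorized form exists it is usually the simplest to write.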


R Support

Get a book (Dalgaard, Peter. 2002. Introductory Statistics with R. Springer)

Google your question – Google searches the wikis

Ask a friend!

Join an R user group!


R Packages
More tools!

ade4          # Analysis of ecological data
ape           # Analysis of Phylogenetics and Evolution
Bioconductor  # Bioinformatics (non-CRAN)
Biodiversity  # Ecological community analysis
boot          # Bootstrap functions
bqtl          # Bayesian QTL mapping
Cairo         # R graphics device (supports .tif)
climatol      # Tools for Climatology
colorRamps    # Builds color tables.
Geneland      # Landscape genetics
geometry      # Mesh generation, tessellation
HSAUR         # Handbook of Stat Analysis Using R
kernelPOP     # Spatially explicit population genetics
maps          # Draw Geographic Maps
Rcmdr         # Basic GUI


R Packages
More tools!

R2WinBUGS   # Run WinBUGS from R
rJava       # Low-level R to Java interface
RMySQL      # R interface to the MySQL database
ROracle     # Oracle database interface to R
SASxport    # Read and write SAS XPORT files
seqinr      # Biological sequences retrieval & analysis
shapefiles  # Read and write ESRI Shapefiles
spgrass6    # Interface between GRASS GIS and R
spgwr       # Geographically weighted regression
vegan       # Community Ecology Package
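
A sketch of installing and loading a package. The install line is commented out because it needs a network connection and a CRAN mirror. The 'boot' package from the list is used as the example on the assumption that it is already installed (it ships with many R distributions); the bootstrap of the iris sepal lengths is illustrative only.

```r
# install.packages("vegan")   # one-time download from CRAN

have.boot <- requireNamespace("boot", quietly = TRUE)  # TRUE if 'boot' is installed

if (have.boot) {
  library(boot)               # attach the package so its functions are available
  set.seed(1)
  # Bootstrap the mean: the statistic receives the data and resampled indexes.
  b <- boot(data = iris$Sepal.Length,
            statistic = function(d, i) mean(d[i]),
            R = 200)
  print(b$t0)                 # the mean of the original sample
} else {
  message("Package 'boot' is not installed; try install.packages(\"boot\").")
}
```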


R Resources

R-project for statistical computing:
http://www.r-project.org/

Bioconductor (microarray stuff):
http://www.bioconductor.org/

The R Graphics Gallery:
http://addictedtor.free.fr/graphiques/

Emacs Speaks Statistics:
http://ess.r-project.org/
