Getting Started with R Alok Srivastava Lecture - 02
Getting started with R
Alok Srivastava, CRRAO-AIMSCS, Hyderabad, INDIA
Jan 08, 2015
Basics of R Programming Alok Srivastava 101212
Topics
1 How to use R
2 Data types in R
3 Data creation
4 Data curation
Basics of R programming Lecture - 02
Getting Started with R Lecture 02
Topic 1 : How to use R
Check your Working Path
R installation directory
R.home()   # R installation directory {which R}
Check your working path
getwd()    # Get the location of the current working directory
Linux: /home/alok/WorkShop/2014/Workshop_UoH_14_Jan/Lecture2
Windows: C:/Users/Alok/Documents
It'll help to load the path
Change your Working Path
Change your working path
setwd(dir)   # Change the working directory to the path given in dir
Recheck your working directory
getwd()
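A minimal sketch of checking, changing and restoring the working directory. The use of tempdir() as the target is an assumption made only so the example runs on any machine; in practice the target would be your own project folder.

```r
# Remember the current working directory so it can be restored later.
old_dir <- getwd()

# tempdir() is used here only because it is a directory that always exists.
setwd(tempdir())
new_dir <- getwd()        # now points at the temporary directory

# Go back to where we started and recheck.
setwd(old_dir)
restored_dir <- getwd()
```

Storing the old path before calling setwd() makes it easy to return to the starting point at the end of a script.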
Strings
Working with Text editor
- Use hash # to comment
Use R as Calculator
Arithmetic Operators
Addition               +
Subtraction            -
Multiplication         *
Division               /
Exponent               ^ or **
Modulus (x mod y)      x %% y
Integer Division       x %/% y
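The operators listed above can be tried directly at the console. The sketch below uses two illustrative values, a and b, which are assumptions and not part of the slides.

```r
a <- 17
b <- 5

total      <- a + b     # addition
difference <- a - b     # subtraction
product    <- a * b     # multiplication
quotient   <- a / b     # division (result is not truncated)
power      <- 2 ^ 10    # exponent; 2 ** 10 gives the same value
remainder  <- a %% b    # modulus: what is left after integer division
int_div    <- a %/% b   # integer division: how many times b fits in a
```

Integer division and modulus belong together: int_div * b + remainder gives back a.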
Variable
Multiple Variables
Workspace in R
Save workspace
save.image()                  # Default file .RData
unlink(".RData")              # To remove
save.image("mywork.Rdata")    # In a specific file
load("mywork.Rdata")          # Load previous work
savehistory(file="abc")       # Save in a text file, default .Rhistory
loadhistory(file="abc")       # Load history from file
Quit Session
q()   # It will ask to save the workspace image? [y/n/c]
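The save and load round trip can be sketched without touching your real workspace. Note the substitution: this example uses save() on a private environment instead of save.image(), because save.image() always writes the whole global workspace. The environment ws, the object gene.count and the temporary file name are all assumptions made for illustration.

```r
# A private environment stands in for the workspace in this sketch.
ws <- new.env()
assign("gene.count", c(15, 18, 25), envir = ws)

# tempfile() gives a throwaway file name, so no real .RData file is overwritten.
ws_file <- tempfile(fileext = ".RData")
save(list = ls(ws), file = ws_file, envir = ws)

# load() recreates the saved objects and returns their names.
restored <- new.env()
loaded_names <- load(ws_file, envir = restored)

# unlink() removes the file again, as in unlink(".RData").
unlink(ws_file)
```

In an interactive session, save.image("mywork.Rdata") followed later by load("mywork.Rdata") does the same thing for every object in the workspace.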
Character variable in R
- Used to store names or categorical variables
- Use double quotes to store a character value
- Use the c() function to store multiple values
Getting help in R
Within R:
The ? command can be used to get help on a specific command within R
?keyword or help(keyword)             # Command search
??keyword or help.search("keyword")   # If you don't know the function name
apropos("mean")                       # List all functions containing the string "mean"
Search library functions
library(help=base)   # List of base functions available with the R console
library(help=pamr)   # List of functions available in package pamr
library(help=samr)   # List of functions available in package samr
To display the help page of a function in a package, first we have to load the library.
Documentation
Help files can be accessed in text or HTML format.
Manuals, reference cards, tutorials and news about recent developments are available at http://www.r-project.org/other-docs.html
Online help
R-help : https://www.stat.math.ethz.ch/pipermail/r-help/
Bioconductor-help : https://stat.ethz.ch/mailman/listinfo/bioconductor
Practice Session: 1
Topic 2 : Data Types in R
Variable types in R
Numeric
Integer
x = 6
is.double(x)    # TRUE (is.real() is defunct in current versions of R)
is.integer(x)   # FALSE; write x = 6L to store an integer
Logical
x = c(1,2,3,4,5); y = (x
Vectors in R
Vectors may have mode logical, numeric, or character.
Examples of Vectors
x = c(45, 90, 135)
y = c("Kinjal","Madhav","Roopa","Suraj")
z = c("gene1", "gene2", "gene3", "gene4")
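The difference between numeric and integer storage, and the way a comparison produces a logical vector, can be sketched as follows. The comparison value 3 and the variable names are assumptions chosen for illustration.

```r
x  <- 6     # a plain number is stored as double (numeric)
xi <- 6L    # the L suffix stores the value as an integer

x_is_double   <- is.double(x)
x_is_integer  <- is.integer(x)
xi_is_integer <- is.integer(xi)

# A comparison on a numeric vector returns a logical vector of the same length.
v <- c(1, 2, 3, 4, 5)
y <- (v > 3)            # threshold 3 is illustrative
y_mode <- mode(y)

# Character vector: every element is a quoted string.
genes <- c("gene1", "gene2", "gene3", "gene4")
genes_mode <- mode(genes)
```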
Arrays in R
Arrays may also have mode logical, numeric, or character.
A two-dimensional array is the same as a matrix.
Examples of a two-dimensional array
x = array(data, dim)
x = array(1:3, c(2,4))
Examples of a three-dimensional array
x = array(1:3, c(2,4,2))   # the last value, 2, is the size of the third dimension
x = array(1:3, c(2,4,3))   # the last value, 3, is the size of the third dimension
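A short sketch of what these calls produce. The data 1:3 is shorter than the array, so R recycles it and fills the array column by column. The variable names x2 and x3 are assumptions.

```r
x2 <- array(1:3, c(2, 4))      # 2 rows, 4 columns; 1:3 is recycled to fill 8 cells
x3 <- array(1:3, c(2, 4, 2))   # two slices, each of size 2 x 4

dims2 <- dim(x2)
dims3 <- dim(x3)

first_row    <- x2[1, ]        # values are filled column by column
second_slice <- x3[, , 2]      # the second 2 x 4 slice of the 3-D array
same_as_matrix <- is.matrix(x2)
```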
Matrices in R
Col1 Col2 Col3 Col4
Row1
Row2
Row3
Is a matrix
Dimension : 3 X 4
Row names : Row1, Row2, Row3
Column names : Col1, Col2, Col3, Col4
Row size: 3
Column size: 4
Data frames in R
Col1 Col2 Col3 Col4
Row1
Row2
Row3
A data frame is a generalization of a matrix
Different columns may have different data types
All elements of any one column must have the same data type, i.e. all numeric, or all factor, or all character, or all logical
Used by R modeling and graphical functions
If the data is read in using read.csv, read.table, etc., it is automatically returned as a data frame.
Data lists in R
Row1
Row2
Row3
A data list is an arrangement of different lists
Different rows may have different numbers of variables
All elements of any one row must have the same data type, i.e. all numeric, or all factor, or all character, or all logical
Topic 3 : Data Creation
Data Creation : Vectors and Arrays
Data creation :
c(1,2,3,4)             # combine arguments to create a vector
from:to                # create a sequence running from "from" to "to"
seq(from,to,by=diff)   # create an arithmetic series
rep(c(1,2,3,4),4)      # replicate the elements of a vector
array(1:3, c(2,4))     # create an array of size 2X4
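The creation functions above, sketched with concrete values. All values and variable names are assumptions chosen for illustration.

```r
v1 <- c(1, 2, 3, 4)               # combine values into a vector
v2 <- 2:5                         # sequence from 2 to 5 in steps of 1
v3 <- seq(5, 50, by = 5)          # arithmetic series 5, 10, ..., 50
v4 <- rep(c(1, 2, 3, 4), 4)       # the whole vector repeated 4 times
v5 <- rep(c("a", "b"), each = 2)  # each element repeated twice in turn
```

The times form of rep() repeats the whole vector, while each = repeats every element before moving on to the next.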
Arrays in R
Arrays may also have mode logical, numeric, or character.
A two-dimensional array is the same as a matrix.
Examples of a two-dimensional array
x = array(data, dim)
x = array(1:3, c(2,4))
Examples of a three-dimensional array
x = array(1:3, c(2,4,2))   # creates two arrays of size 2X4; the last value, 2, is the size of the third dimension
x = array(1:3, c(2,4,3))   # the last value, 3, is the size of the third dimension
Data creation : Matrices in R
a = 1:3
b = 4:6
c = 7:9
# cbind combines objects by column
X = cbind(a,b,c)
# rbind combines objects by row
Y = rbind(a,b,c)
# Matrix by defining the number of rows and columns
Z = matrix(c(1,4,6,2,3,7.8), nrow=2, ncol=3, byrow=TRUE)
Z = matrix(c(1,4,6,2,3,7.8), nrow=2, ncol=3, byrow=FALSE)
expression_data = matrix(c(1,2,3, 11,12,13), nrow = 2, ncol=3, byrow=TRUE, dimnames = list(c("gene1", "gene2"), c("Sample.1", "Sample.2", "Sample.3")))
# To generate a random matrix of 10 rows and 5 columns
replicate(5, rnorm(10))
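A sketch of how these calls differ in practice. The names Z_row, Z_col and R are assumptions introduced so that both fill orders can be compared side by side; set.seed() is added only to make the random matrix repeatable.

```r
a <- 1:3
b <- 4:6
c <- 7:9   # a variable named c does not stop the function c() from working

X <- cbind(a, b, c)   # a, b, c become the columns
Y <- rbind(a, b, c)   # a, b, c become the rows

# The same six values, filled row by row and then column by column.
vals  <- c(1, 4, 6, 2, 3, 7.8)
Z_row <- matrix(vals, nrow = 2, ncol = 3, byrow = TRUE)
Z_col <- matrix(vals, nrow = 2, ncol = 3, byrow = FALSE)

# Random matrix: each of the 5 replicates becomes one column of 10 values.
set.seed(1)
R <- replicate(5, rnorm(10))
```

With byrow = TRUE the first row is 1, 4, 6; with byrow = FALSE the first column is 1, 4.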
Data Creation : Data frames
Data Frame :
go.term = c("GO0009117","GO0009253","GO0009354")
gene.count = c(15,18,25)
avg.expression.value = c(0.5432,0.2371,0.7867)
go.term.rank = c(2,1,3)
mydata = data.frame(go.term,gene.count,avg.expression.value,go.term.rank)
mydata2 = data.frame(rank=1:4,gene_name=c("ddr1","apr2","bac","p53"),n=c(.90,.75,.52,.31))
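A minimal sketch of building a data frame from column vectors and inspecting it. The argument stringsAsFactors = FALSE is an addition, used so that the text column stays character in every R version; the filter threshold 16 is an assumption for illustration.

```r
go.term              <- c("GO0009117", "GO0009253", "GO0009354")
gene.count           <- c(15, 18, 25)
avg.expression.value <- c(0.5432, 0.2371, 0.7867)

# Each vector becomes one column; all vectors must have the same length.
mydata <- data.frame(go.term, gene.count, avg.expression.value,
                     stringsAsFactors = FALSE)

col_types <- sapply(mydata, class)        # one data type per column
top <- mydata[mydata$gene.count > 16, ]   # keep rows with more than 16 genes
```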
Data Creation : Data Lists
Data List :
genelist1 = c("abc1","abc2")
genelist2 = c("brca1","brca2","tp53","mdm2")
genelist3 = c("apr","erpn","myc")
mylist = list(genelist1,genelist2,genelist3)
mylist2 = list(rank=1:4,gene_name=c("ddr1","apr2","bac","p53"),n=c(.90,.75,.52,.31))
List : collection of several objects of any type
x1 = c("gene1","gene2","gene3","gene4","gene5")
x2 = c(2,4,7,9,11)
x = list(x1,x2)
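Unlike a data frame, a list can hold elements of different lengths and types. The sketch below uses hypothetical element names (genes, scores, flag) to show the common ways of getting elements back out.

```r
mylist <- list(genes  = c("brca1", "brca2", "tp53", "mdm2"),
               scores = c(2, 4, 7),
               flag   = TRUE)

sizes   <- sapply(mylist, length)   # elements may have different lengths
second  <- mylist[[2]]              # [[ ]] returns the element itself
by_name <- mylist$genes             # $ extracts an element by its name
sub     <- mylist[1]                # single [ ] returns a shorter list
```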
Practice Session: 2
Topic 4 : Data Curation
Variable Information
is.na(x)           # To identify missing values
is.array(x)        # Is x an array (one, two or more dimensions)?
is.vector(x)       # Is x a vector (one dimension)?
is.matrix(x)       # Is x a matrix (two-dimensional array)?
is.data.frame(x)
is.numeric(x)
is.complex(x)
is.character(x)
Variable conversion
as.vector(x)
as.matrix(x)
as.data.frame(x)
as.character(x)
Variable attributes
Attributes
length(x)     # Length of vector
dim(x)        # Dimension of matrix
dimnames(x)   # Dimension names
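The is.*, as.* and attribute functions can be tried together on one small object. The matrix m and its row and column names are assumptions chosen for illustration.

```r
m <- matrix(1:6, nrow = 2,
            dimnames = list(c("gene1", "gene2"), c("S1", "S2", "S3")))

# Information: what kind of object is m?
is_mat <- is.matrix(m)
is_df  <- is.data.frame(m)
is_num <- is.numeric(m)

# Conversion: the same values in other containers.
df <- as.data.frame(m)   # columns S1, S2, S3
v  <- as.vector(m)       # dimensions and names are dropped
s  <- as.character(v)    # numbers become strings

# Attributes.
n_elem <- length(m)      # total number of elements
shape  <- dim(m)         # rows, columns
labels <- dimnames(m)    # list of row names and column names
```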
Missing Values
Variables of each data type (numeric, character, logical) can also take the value NA: not available.
- NA is not the same as 0
- NA is not the same as ""
- NA is not the same as FALSE
For any operation (calculation, comparison) that involves NA, we have to indicate explicitly whether missing values should be kept or removed.
> NA==1
[1] NA
> 1+NA
[1] NA
> max(c(NA, 4, 7))
[1] NA
> max(c(NA, 4, 7), na.rm=T)
[1] 7
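A short sketch of the usual ways to find and handle NA. The vector reuses the values from the console example above; the variable names are assumptions.

```r
x <- c(NA, 4, 7)

missing   <- is.na(x)               # flags the missing entries
n_missing <- sum(missing)           # how many values are missing

mean_all <- mean(x)                 # NA, because one value is unknown
mean_obs <- mean(x, na.rm = TRUE)   # NA removed before averaging
complete <- x[!is.na(x)]            # keep only the observed values

# x == NA does not work as a test for missing values; use is.na() instead.
cmp <- (NA == 1)
```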
Data selection and manipulation
Slicing and Extracting Data : Vectors
x[n]                     # nth element
x[-n]                    # all but the nth element
x[1:n]                   # first n elements
x[-c(1:n)]               # elements from n+1 to the end
x[c(2,5,7)]              # specific elements
x[x>5]                   # all elements greater than 5
x[x>5 & x<9]             # all elements between 5 and 9
x[x %in% c("ab","sh")]   # elements that occur in the given vector
Data selection from list and data frame :
x[[n]]          # nth element of the list
x$name          # extract the element (or column) called name
attributes(x)   # attributes of the data frame
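The indexing forms above, applied to a small vector. The values of x and g are assumptions chosen so that each selection gives a different result.

```r
x <- c(3, 8, 6, 12, 5, 7, 9)

second       <- x[2]             # 8
all_but_2nd  <- x[-2]            # drops the 8
first_three  <- x[1:3]           # 3 8 6
after_three  <- x[-c(1:3)]       # 12 5 7 9
picked       <- x[c(2, 5, 7)]    # 8 5 9
above_five   <- x[x > 5]         # 8 6 12 7 9
five_to_nine <- x[x > 5 & x < 9] # 8 6 7

# %in% tests membership, which is handy for character vectors.
g <- c("ab", "cd", "sh", "xy")
matched <- g[g %in% c("ab", "sh")]
```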
Basic Matrix operation
Matrix curation :
x[r,c]          # element at the rth row and cth column
x[r,]           # row r
x[,c]           # column c
x[,c(2,5,8)]    # to select specific columns
Matrix operation:
dim(x)        # dimension of the matrix
x+y           # sum of matrices x and y
t(x)          # transpose of the matrix
diag(x)       # diagonal elements of the matrix
nrow(x)       # number of rows
rownames(x)   # row names
rowSums(x)    # row sums
rowMeans(x)   # row means
cor(x)        # correlation matrix
var(x)        # variance matrix
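A sketch of these selections and operations on a small named matrix. The matrix x and its labels are assumptions chosen for illustration.

```r
x <- matrix(1:6, nrow = 2, byrow = TRUE,
            dimnames = list(c("r1", "r2"), c("c1", "c2", "c3")))
# x is:   c1 c2 c3
#     r1   1  2  3
#     r2   4  5  6

elem     <- x[2, 3]           # row 2, column 3
row1     <- x[1, ]            # whole first row
cols     <- x[, c(1, 3)]      # first and third columns
doubled  <- x + x             # element-wise sum of two matrices
tx       <- t(x)              # transpose: 3 rows, 2 columns
row_sum  <- rowSums(x)
row_mean <- rowMeans(x)
n_rows   <- nrow(x)
```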
Data selection and manipulation
Data selection and manipulation :
x*2               # scalar multiplication
length(x)         # length of the vector
sum(x)            # sum of the elements in the vector
max(x), min(x)    # max and min values
rev(x)            # reverse the order
sort(x)           # sorting
unique(x)         # unique values
rle(x)            # run length encoding
table(a,b)        # comparison table (cross-tabulation) of a and b
sample(x)         # random sampling of the data
which.max(x)      # index of the maximum element of x
which.min(x)      # index of the minimum element of x
which(x == a)     # vector of indices of x where the comparison is TRUE
which(x %in% a)   # indices of the elements of x that match a
choose(n,k)       # number of combinations of k elements chosen from n
rank(x)           # ranking
round(x,3)        # round the elements of x to 3 decimal places
order(x)          # indices that put x in sorted order
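Several of these functions are easiest to understand from their results. The vectors z and w below are assumptions chosen so that rank() and order() give visibly different answers.

```r
z <- c(2, 2, 3, 3, 3, 4)
runs   <- rle(z)          # runs of equal values: lengths and the value of each run
idx3   <- which(z == 3)   # positions where the comparison is TRUE
uniq_z <- unique(z)

w <- c(30, 10, 20)
i_max  <- which.max(w)    # position of the largest value
i_min  <- which.min(w)    # position of the smallest value
ord    <- order(w)        # positions that would sort w
rnk    <- rank(w)         # rank of each element within w
sorted <- sort(w)
backw  <- rev(w)

n_pairs <- choose(5, 2)   # ways to choose 2 items out of 5
```

order() answers "which element comes first", while rank() answers "what place does each element take".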
Basic Math and Statistics functions
Basic Maths and Statistics functions:
sqrt(x)                         # square root of x
sin(x), cos(x)                  # trigonometric functions
asin(x), acos(x)                # inverse trigonometric functions
log(x), log10(x), log(x,base)   # logarithms
exp(2)                          # exponential function
max(x), min(x)                  # max and min values
range(x)                        # range
sum(x)                          # sum of x
mean(x)                         # mean of the elements of x
median(x)                       # median of the elements of x
var(x)                          # variance of the elements of x
sd(x)                           # standard deviation of x
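A minimal sketch of the summary functions on one small sample. The data vector x is an assumption chosen so that the results are easy to verify by hand.

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

avg <- mean(x)       # 40 / 8
med <- median(x)     # average of the two middle values, 4 and 5
v   <- var(x)        # sample variance: squared deviations divided by n - 1
s   <- sd(x)         # square root of the variance
rng <- range(x)      # smallest and largest value

root  <- sqrt(16)
lg10  <- log10(1000)
lg2   <- log(8, base = 2)
e_val <- exp(log(5))   # exp() and log() undo each other
```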
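Advanced R built-in functions:
abs(x)           # absolute values
ceiling(x)       # smallest integer not less than x
floor(x)         # largest integer not greater than x
trunc(x)         # drop the fractional part
round(3.4578)    # round; 0 decimal places by default, set with round(x, digits)
signif(3.4578)   # signif; 6 significant digits by default, set with signif(x, digits)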
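The rounding functions differ mainly in direction and in how the digits argument is counted. The sketch below uses the number from the slide; the digits values are assumptions chosen for illustration.

```r
x <- 3.4578

up      <- ceiling(x)      # next integer above
down    <- floor(x)        # next integer below
cut     <- trunc(-x)       # toward zero, so -3.4578 becomes -3
down_n  <- floor(-x)       # floor goes down, so -3.4578 becomes -4
r0      <- round(x)        # no digits given: nearest integer
r2      <- round(x, 2)     # two decimal places
s2      <- signif(x, 2)    # two significant digits
mag     <- abs(-x)         # sign removed
```

round() counts digits after the decimal point, while signif() counts digits from the first non-zero digit.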
Practice Session: 3
Exercise:1
Exercise 1: Round off the number 3.543321 to three decimal places.
Exercise 2: Generate a sequence x = seq(1,524,d), where d is a random number between 2 and 9. Find:
- length(x)
- sum(x)
- the cube root of x
- the 5th and 7th elements of vector x
- the 2nd to 5th elements of vector x
- a vector without the 2nd to 5th elements of vector x
- which elements of vector x are greater than 10
- a vector whose elements are greater than 10
- a vector whose elements are greater than 10 and less than 50
- max, min, rev, sort, unique, range
Exercise 3: Explore the commands
a=rep(2,5)
b=rep(3,7)
c=rep(4,2)
z2=c(a,b,c)
z=sample(z2)   # analyze z
u=rle(z)
sort(z)
unique(z)
- What does the sample command do?
- attributes of u
- analyze u   # Interpret what rle does
- mean, median, var, sd
- convert z into log scale
Exercise 4: Generate 20 replicates of a TRUE sample, denoted by T, and 10 replicates of a FALSE sample, denoted by F.
Exercise 5: Generate a vector z1 from 2 to 5 and a second vector z3 from 12 to 15, and combine them into a new vector z.
Exercise 6: Write the sequence expression for 5 10 15 20 25 30 35 40 45 50
Exercise 7: Generate a sequence from 19 to 957 with a difference of 17.
Exercise 8: Generate any 3X4 matrix using the command matrix.
Exercise 9: Create 3 vectors a, b, c of size 5, generate a matrix using cbind and rbind, and calculate the dimension of each matrix.
Exercise 10: A class of 20 students appeared for maths and biology exams and secured marks between 20 and 90. Generate random marks satisfying the above criteria in a matrix that contains the names of the students, S1, S2, ..., S20, as the first row and the subjects, math1 and bio1, as the first column. Save the names and marks of the students who
- secured more than 70% marks in either of the two subjects, and
- failed in either of the two.
Also find the average marks secured by students in both subjects.
Don't forget to save Workspace..
THANK YOU .........
Alok Srivastava, Assistant Professor, CRRAO-AIMSCS, Hyderabad, INDIA
Date 8-01-15