An Introduction to R

Prof. Ke-Sheng Cheng

Dept. of Bioenvironmental Systems Engineering

National Taiwan University

References:
1. Venables, W.N., Smith, D.M. and the R Core Team. An Introduction to R.
2. Zuur, Ieno and Meesters, 2009. A Beginner’s Guide to R.
3. Norman Matloff, 2011. The Art of R Programming.


The R-project
• R is free software (www.r-project.org).
• The S language
– S-Plus (commercial software)
• R is an integrated software environment for data manipulation, calculation and graphical display. It includes:
– An efficient data handling and storage facility,
– A suite of operators for calculations on arrays, in particular matrices,
– A large, coherent, integrated collection of intermediate tools for data analysis,

04/21/23 Laboratory for Remote Sensing Hydrology and Spatial Modeling, Dept of Bioenvironmental Systems Engineering, National Taiwan Univ.


– Graphical facilities for data analysis and display either directly at the computer or on hardcopy,
– A well developed, simple and effective programming language which includes conditionals, loops, user defined recursive functions and input and output facilities.
• R packages (CRAN)
– Standard packages
– Other packages available at the Comprehensive R Archive Network (CRAN)
• R community
– Local R users groups
– R-bloggers (useful and interesting examples and discussions)


Downloading and Installing R

• http://www.r-project.org/


Starting an R session


• In addition to R, there are also
– RStudio
– R Commander
• We will stick to R in this class.


Working Environment of R

• The working environment of R can be illustrated by the following graph:
[Figure: working environment of R, with boxes labeled Directory 1, Directory 2, Working Directory, Workspace and Temporary memory]


Running R

• When you first start running R the default prompt is the “>” sign.


• Working directory
– In using R, you need to know and specify the working directory. This is done by clicking the Change dir button.
– One can specify different working directories for different projects.
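The working directory can also be inspected and changed from the prompt. A minimal sketch using the base functions getwd() and setwd(); the temporary directory here is only a stand-in for a real project folder (an assumption for illustration):

```r
old.dir <- getwd()        # remember the current working directory
proj.dir <- tempdir()     # hypothetical project folder (stand-in)
setwd(proj.dir)           # equivalent to choosing it with "Change dir"
new.dir <- getwd()        # confirm where R now reads and writes files
setwd(old.dir)            # switch back when done
```

Keeping one directory per project means that relative file names such as "ksc.r" always resolve inside that project.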


• Getting help
– > help(…) and > help.search("….")


Executing commands from an external file

• R commands can be stored in an external file (for example, ksc.r) in the working directory. These commands can then be executed with the source command:
> source("ksc.r")
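As a small illustration, the sketch below writes a two-line script and then runs it with source(). The file is placed in a temporary directory so that nothing in the working directory is overwritten; the script contents are made up for the example:

```r
script <- file.path(tempdir(), "ksc.r")           # stand-in location for ksc.r
writeLines(c("a <- 1:5", "b <- sum(a)"), script)  # two R commands stored in the file
source(script)                                    # execute every command in the file
b                                                 # objects created by the script now exist
```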


Open and run an existing file


Objects and Workspace
• The entities that R creates and manipulates are known as objects. These may be variables, arrays of numbers, character strings, functions, or more general structures built from such components.
• During an R session, objects are created and stored by name. The R command
> objects() (alternatively, ls())
can be used to display the names of (most of) the objects which are currently stored within R.
• The collection of objects currently stored is called the workspace.
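A short sketch of how objects enter the workspace and how to list them; the object names are arbitrary:

```r
x <- c(1.5, 2.5)           # a numeric vector
msg <- "hello"             # a character string
sq <- function(v) v^2      # a function is an object too
ws <- ls()                 # names of the objects currently in the workspace
ws
```

objects() returns the same listing as ls().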


Data permanency and removing objects
• To remove objects the function rm is available:
> rm(x, y, z, ink, junk, temp, foo, bar)
• All objects created during an R session can be stored permanently in a file for use in future R sessions. At the end of each R session you are given the opportunity to save all the currently available objects. If you indicate that you want to do this, the objects are written to a file called ‘.RData’ in the current directory, and the command lines used in the session are saved to a file called ‘.Rhistory’.


• When R is started at a later time from the same directory it reloads the workspace from this file (.RData). At the same time the associated command history is reloaded.
• Remove all objects in the workspace
– rm(list=ls())
• Clear the screen
– Ctrl + L
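Saving and restoring can also be done explicitly at any point in a session. The sketch below uses the base functions save() and load() on a temporary file rather than '.RData' itself; the file name and objects are made up for the example:

```r
x <- 1:10
y <- "keep me"
img <- file.path(tempdir(), "demo.RData")   # stand-in for '.RData'
save(list = c("x", "y"), file = img)        # write the named objects to the file
rm(x, y)                                    # remove them from the workspace
load(img)                                   # restore them from the file
```

save.image() does the same for the whole workspace in one call.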


Reading data from files

• The read.table() function


• The scan() function
• The read.csv() function
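The three readers can be compared on a tiny data set. This is only a sketch: the files and their contents are invented and written to a temporary directory first so the example is self-contained:

```r
f <- file.path(tempdir(), "flow.txt")
writeLines(c("time flow", "1 10.2", "2 11.5", "3 9.8"), f)
d1 <- read.table(f, header = TRUE)   # whitespace-separated table -> data frame

g <- file.path(tempdir(), "flow.csv")
writeLines(c("time,flow", "1,10.2", "2,11.5", "3,9.8"), g)
d2 <- read.csv(g)                    # comma-separated, header assumed by default

v <- scan(f, skip = 1)               # skips the header, returns one numeric vector
```

read.table() and read.csv() return data frames with named columns, whereas scan() returns a plain vector read row by row.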


Output data to files

• write, write.table, write.csv

write(x,"output.txt",ncolumns=10,append=TRUE,sep="\t")

write(round(x,digits=2),"output.txt",ncolumns=10,append=TRUE,sep="\t")


Redirecting outputs
• sink
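A minimal sketch of redirecting printed output with sink(); the output file name is made up and placed in a temporary directory:

```r
out <- file.path(tempdir(), "session_output.txt")
sink(out)                            # printed output now goes to the file
print(summary(c(1, 2, 3, 4, 100)))
sink()                               # calling sink() with no argument restores the console
logged <- readLines(out)             # the file holds what would have been printed
logged
```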


Objects, their modes and attributes

• Intrinsic attributes: mode and length
– The entities R operates on are technically known as objects. Examples are vectors of numeric (real) or complex values, vectors of logical values and vectors of character strings.
– These vectors are known as “atomic” structures since their components are all of the same type, or mode, namely numeric, complex, logical, character and raw.


Atomic structures of R
– Vectors must have their values all of the same mode. Thus any given vector must be unambiguously either logical, numeric, complex, character or raw. (The only apparent exception to this rule is the special “value” listed as NA for quantities not available, but in fact there are several types of NA).
– Note that a vector can be empty and still have a mode. For example the empty character string vector is listed as character(0) and the empty numeric vector as numeric(0).
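The points above can be checked directly with mode(). A brief sketch with arbitrary values:

```r
v.num <- c(1.5, 2, 3)
v.lgl <- c(TRUE, NA, FALSE)      # the NA here is a logical NA
mixed <- c(1, "a", TRUE)         # mixing types coerces everything to one mode
mode(v.num)                      # "numeric"
mode(v.lgl)                      # "logical"
mode(mixed)                      # "character"
mode(NA_real_)                   # "numeric": one of the several types of NA
mode(NA_character_)              # "character"
e.chr <- character(0)            # empty, but still of mode character
e.num <- numeric(0)              # empty, but still of mode numeric
```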


Recursive structures of R

• R also operates on objects called lists, which are of mode list. These are ordered sequences of objects which individually can be of any mode.

• Lists are known as “recursive” rather than atomic structures since their components can themselves be lists in their own right.
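A small sketch of a list whose components have different modes, one of them being a list itself. The component names and values are hypothetical:

```r
station <- list(name = "A1",                        # character component
                flow = c(10.2, 11.5),               # numeric component
                info = list(lat = 25.0, lon = 121.5))  # a list inside a list
mode(station)          # "list"
length(station)        # 3
mode(station$flow)     # "numeric"
station$info$lat       # components of nested lists are reached step by step
```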


• The other recursive structures are those of mode function and expression.

• Functions are the objects that form part of the R system along with similar user written functions.

• Expressions are objects which form an advanced part of R.


An example of using function

# AR2_Bootstrap.R
# Coded by KSC 08232011 at the University of Bristol -------------
# AR modeling of the flow data series
x=read.csv("Nine_flow_events.csv",sep=",")
n.event=9
n.bt=1000 # number of bootstrap samples
alpha1=c();alpha2=c();alpha3=c();alpha0=c()
predct=c()
par.ar=matrix(rep(0,n.event*4),ncol=4,nrow=n.event)
file.name=paste("event",1:n.event,".txt",sep="")
bt.name=paste("bootstrap",1:n.event,".txt",sep="")
#------------------------------------------------------------------
# Function -- AR(2) Forecasting
forecast=function(obs,par1,par2,par3,predct){
  L=length(obs)
  u1=0;u2=0
  obs=c(u1,u2,obs)
  for (i in 1:L) predct[i]=par3+par1*obs[i+1]+par2*obs[i]
  err=obs[3:(L+2)]-predct
  out=c(predct,err)
  return(out)
}
#------------------------------------------------------------------


# AR(2) Modeling, forecasting and bootstrapping of individual series
for (i in 1:n.event) {
  event=x[[i]][!is.na(x[i])]
  # AR(2) Modeling ------------
  windows()
  pacf(event)
  ar.event=arima(event,order=c(2,0,0))
  alpha1[i]=ar.event[[1]][1]
  alpha2[i]=ar.event[[1]][2]
  alpha3[i]=ar.event[[1]][3]
  alpha0[i]=(1-alpha1[i]-alpha2[i])*alpha3[i]
  par.ar[i,]=c(alpha0[i],alpha1[i],alpha2[i],alpha3[i])
  #
  # AR(2) Forecasting ---------
  out.4cast=forecast(event,alpha1[i],alpha2[i],alpha0[i],predct)
  err=out.4cast[(length(event)+1):(2*length(event))]
  err.star=err-mean(err)
  write(event,file.name[i],ncolumns=10,append=TRUE,sep="\t")
  write(out.4cast[1:length(event)],file.name[i],ncolumns=10,append=TRUE,sep="\t")
  write(err,file.name[i],ncolumns=10,append=TRUE,sep="\t")
  #


  # Model-based Time Series Bootstrapping --------------
  btsample=matrix(rep(0,n.bt*(2+length(err))),nrow=n.bt,ncol=2+length(err))
  for (j in 1:n.bt){
    epsilon=sample(err.star,size=length(err),replace=TRUE)
    for (k in 3:(2+length(err))){
      btsample[j,k]=alpha0[i]+alpha1[i]*btsample[j,k-1]+alpha2[i]*btsample[j,k-2]+epsilon[k-2]
    }
    write(btsample[j,3:(2+length(err))],bt.name[i],ncolumns=10,append=TRUE,sep="\t")
  }
  #
  # Plot observed and bootstrap sample series
  windows()
  z=scan(bt.name[i],sep="\t")
  plot(0,0,type="n",xlim=c(0,length(event)),ylim=c(min(z),max(z)))
  dim(z)=c(length(event),n.bt)
  for (j in 1:n.bt) lines(1:length(event),z[,j],type="l")
  lines(1:length(event),event,type="l",col="red",lwd=3)
}
par.ar


An example using function ecdf

# ECDF_Plot.R
# Coded by KSC 09242011 -----------------
n.sample=9 # Number of samples
in.file=paste("CECP",1:n.sample,".txt",sep="")
windows()
plot(0,0,type="n",xlim=c(-1,1),ylim=c(0,1))
for (i in 1:n.sample){
  x=scan(in.file[i],sep="\t")
  n.L=length(x)
  x1=x[1:(n.L/2)]
  x2=x[(1+(n.L/2)):n.L]
  x1.ecdf=ecdf(x1);x2.ecdf=ecdf(x2)
  u=seq(-1,1,by=0.005);v=x1.ecdf(u)
  lines(u,v,type="l",col=i,lwd=3)
  v1=round(mean(x1),digits=4)
  v2=round(sqrt(var(x1)),digits=4)
  v3=round(mean(x2),digits=4)
  v4=round(sqrt(var(x2)),digits=4)
  print(c(v1,v2,v3,v4))
}
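The script above depends on the CECP data files. To see the key step in isolation, the sketch below applies ecdf() to simulated data; the uniform sample is an assumption that stands in for one half of a data file:

```r
set.seed(1)
x1 <- runif(200, min = -1, max = 1)   # simulated stand-in for the observed sample
x1.ecdf <- ecdf(x1)                   # ecdf() returns a function, not a vector
u <- seq(-1, 1, by = 0.005)           # grid of evaluation points
v <- x1.ecdf(u)                       # proportion of sample values <= each u
range(v)
```

Because x1.ecdf is itself a function, it can be evaluated on any grid and the result passed to lines() as in the script.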


Combination and Permutation
perm = function(n, x) {
  return(factorial(n) / factorial(n-x))
}

comb = function(n, x) {
  return(factorial(n) / (factorial(x) * factorial(n-x)))
}
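A usage sketch for the two functions, repeated here in compact form so the block runs on its own, together with a comparison against the built-in choose():

```r
perm <- function(n, x) factorial(n) / factorial(n - x)
comb <- function(n, x) factorial(n) / (factorial(x) * factorial(n - x))

perm(5, 2)       # 20 ordered selections of 2 out of 5
comb(5, 2)       # 10 unordered selections
choose(5, 2)     # 10, the built-in equivalent of comb

big <- suppressWarnings(comb(200, 2))   # factorial(200) overflows to Inf, so this is NaN
choose(200, 2)                          # 19900: choose() avoids the overflow
```

For large n the factorial-based versions break down, so choose() is the safer option in practice.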


• The functions mode(object) and length(object) can be used to find out the mode and length of any defined structure.


Changing the mode of an object
• as.character(x)
• as.integer(x)
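A short sketch of both conversions on arbitrary values, including what happens when a string cannot be read as a number:

```r
z <- 0:9
digits <- as.character(z)      # "0" "1" ... "9": numbers become strings
d <- as.integer(digits)        # and back to the integers 0, 1, ..., 9
as.integer("3.7")              # 3: the fractional part is dropped
bad <- suppressWarnings(as.integer("abc"))   # not a number, so the result is NA
bad
```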


Changing the length of an object
• An “empty” object may still have a mode. For example
> e <- numeric()
makes e an empty vector structure of mode numeric.
• Once an object of any size has been created, new components may be added to it simply by giving it an index value outside its previous range.
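A sketch of growing and shrinking a vector; the values are arbitrary:

```r
e <- numeric()             # empty numeric vector
length(e)                  # 0
e[3] <- 17                 # index outside the previous range: e becomes NA NA 17
length(e)                  # 3

alpha <- 1:10
alpha <- alpha[2 * 1:5]    # keep the even-indexed elements: 2 4 6 8 10
length(alpha) <- 3         # truncate by assigning a shorter length: 2 4 6
```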


• Other examples


The class of an object

• All objects in R have a class, reported by the function class. For simple vectors this is just the mode, for example "numeric", "logical", "character" or "list", but "matrix", "array", "factor" and "data.frame" are other possible values.
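The difference between class and mode shows up as soon as an object is more than a simple vector. A sketch with small made-up objects:

```r
class(c(1.5, 2))              # "numeric"
class(c(TRUE, FALSE))         # "logical"
class(list(1, "a"))           # "list"

m <- matrix(1:6, nrow = 2)
class(m)                      # includes "matrix" (recent R versions also report "array")
mode(m)                       # "numeric": the mode of the underlying vector

f <- factor(c("lo", "hi", "lo"))
class(f); mode(f)             # "factor" and "numeric"

df <- data.frame(x = 1:2)
class(df); mode(df)           # "data.frame" and "list"
```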


What is an object?
• Any entity R operates on is an object.
– Vector
  • Matrix
  • Array
– List
  • Data.frame
– Function
– Expression
• Class of an object
– Numeric
– Complex
– Character
– Logical
– list
– Factor
– Array
– Matrix
– Data.frame
• Mode of an object
– Numeric
– Complex
– Character
– Logical
– list


Manipulating objects

• Vector assignment (concatenate)


• If an expression is used as a complete command, the value is printed and lost.


Vector arithmetic
• Vectors can be used in arithmetic expressions, in which case the operations are performed element by element.

• Vectors occurring in the same expression need not all be of the same length. If they are not, the value of the expression is a vector with the same length as the longest vector which occurs in the expression. Shorter vectors in the expression are recycled as often as need be (perhaps fractionally) until they match the length of the longest vector. In particular a constant is simply repeated.
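The recycling rule can be seen with two short vectors; the numbers are arbitrary:

```r
a <- 1:6
b <- c(10, 20)
a + b            # b is recycled three times: 11 22 13 24 15 26
a * 2            # the constant 2 is repeated six times: 2 4 6 8 10 12

x <- 1:5
s <- suppressWarnings(x + b)   # fractional recycling (2.5 times): 11 22 13 24 15
s
```

When the longer length is not a multiple of the shorter one, R still recycles but issues a warning, which is suppressed here only to keep the output clean.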


• Example

• Arithmetic operations
– +, -, *, /, ^ (power), round, floor, ceiling
• Arithmetic functions
– log, exp, sin, cos, tan, sqrt, abs


• Statistical functions
– min, max, range, length, sum, mean, median
– quantile, var, prod, summary
– sort, order, rank


• Sort y with respect to increasing order of x.

Same as sort(x)
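A sketch of sorting one vector by another with order(); x and y here are small made-up vectors:

```r
x <- c(3, 1, 2)
y <- c("c", "a", "b")
o <- order(x)     # 2 3 1: positions of x from smallest to largest
y[o]              # "a" "b" "c": y rearranged by increasing x
x[o]              # 1 2 3: identical to sort(x)
rank(x)           # 3 1 2: the rank of each element within x
```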


Logical vectors
• As well as numerical vectors, R allows manipulation of logical quantities. The elements of a logical vector can have the values TRUE, FALSE, and NA (for “not available”).
• Logical vectors are generated by conditions.
• The logical operators are <, <=, >, >=, == for exact equality and != for inequality. In addition if c1 and c2 are logical expressions, then c1 & c2 is their intersection (“and”), c1 | c2 is their union (“or”), and !c1 is the negation of c1.
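A brief sketch of conditions producing logical vectors, including how NA propagates; the values are arbitrary:

```r
x <- c(5, 13, 20, NA)
big <- x > 13                # FALSE FALSE TRUE NA
mid <- x >= 5 & x <= 13      # TRUE TRUE FALSE NA
either <- x < 6 | x > 15     # TRUE FALSE TRUE NA
!big                         # TRUE TRUE FALSE NA
sum(big, na.rm = TRUE)       # 1: TRUE counts as 1 and FALSE as 0
```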


• Example

Why?


Missing values

• NA – not available
• NaN – not a number
• The function is.na(x) gives a logical vector of the same size as x with value TRUE if and only if the corresponding element in x is NA or NaN.
• The function is.nan(x) returns TRUE if and only if the corresponding element is NaN.
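The difference between the two tests, and one common way of dropping missing values, in a short sketch with made-up data:

```r
x <- c(1, NA, 0/0, 4)      # 0/0 produces NaN
is.na(x)                   # FALSE TRUE TRUE FALSE: TRUE for both NA and NaN
is.nan(x)                  # FALSE FALSE TRUE FALSE: TRUE for NaN only
x == NA                    # NA NA NA NA: == cannot be used to detect NA
clean <- x[!is.na(x)]      # 1 4: keep only the available values
mean(x, na.rm = TRUE)      # 2.5: many functions can skip missing values themselves
```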


• Removing the missing values


Character vectors
• Character vectors are used frequently in R, for example as plot labels. Where needed they are denoted by a sequence of characters delimited by the double quote character, e.g., "x-values", "New iteration results".
• The paste() function takes an arbitrary number of arguments and concatenates them one by one into character strings. Any numbers given among the arguments are coerced into character strings in the evident way, that is, in the same way they would be if they were printed.


• The arguments are by default separated in the result by a single blank character, but this can be changed by the named parameter, sep=string, which changes it to string, possibly empty.
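A sketch of paste() with the default separator and with sep set to the empty string; the strings are only examples:

```r
paste("New", "iteration", "results")           # "New iteration results"
labs <- paste(c("X", "Y"), 1:10, sep = "")     # "X1" "Y2" "X3" ... "Y10"
files <- paste("event", 1:3, ".txt", sep = "") # "event1.txt" "event2.txt" "event3.txt"
```

Shorter arguments are recycled, which is why c("X", "Y") alternates across the ten labels.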


Set operations

• union(x, y)
• intersect(x, y)
• setdiff(x, y)
• is.element(el, set)
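The four set functions applied to two small made-up vectors:

```r
x <- c(1, 2, 3, 4)
y <- c(3, 4, 5)
union(x, y)          # 1 2 3 4 5
intersect(x, y)      # 3 4
setdiff(x, y)        # 1 2: in x but not in y
setdiff(y, x)        # 5: the order of the arguments matters
is.element(3, x)     # TRUE
```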


Selecting and modifying subsets of an object using index vectors

• Subsets of a vector may be selected by appending to the name of the vector an index vector in square brackets, v[i].

• Such index vectors can be any of four distinct types:
– A logical vector


– A vector of positive integer quantities
– A vector of negative integer quantities

Such an index vector specifies the values to be excluded rather than included.


– A vector of character strings. This possibility only applies where an object has a names attribute to identify its components.
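The four types of index vector side by side, in a sketch using made-up data:

```r
x <- c(4, NA, 7, -2, 9)
x[!is.na(x) & x > 0]      # logical index: 4 7 9
x[c(1, 3)]                # positive integers: elements 1 and 3, i.e. 4 7
x[-(1:2)]                 # negative integers: everything except elements 1 and 2

fruit <- c(orange = 5, banana = 10, apple = 1, peach = 30)   # a named vector
lunch <- fruit[c("apple", "orange")]                         # character index: 1 5
lunch
```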


• An indexed expression can also appear on the receiving end of an assignment, in which case the assignment operation is performed only on those elements of the vector.
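Two small sketches of assigning through an index, with arbitrary values:

```r
x <- c(4, NA, -7, 2)
x[is.na(x)] <- 0           # replace only the missing values: 4 0 -7 2

y <- c(-3, 5, -1)
y[y < 0] <- -y[y < 0]      # flip only the negative elements: 3 5 1, same as abs(y)
```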


Arrays and Matrices

• An array can be considered as a multiply subscripted collection of data entries.

• A dimension vector is a vector of non-negative integers. If its length is k then the array is k-dimensional, e.g. a matrix is a 2-dimensional array.


• A vector can be used by R as an array only if it has a dimension vector as its dim attribute. Suppose, for example, z is a vector of 1500 elements. The assignment
> dim(z) = c(3,5,100)
gives it the dim attribute that allows it to be treated as a 3 by 5 by 100 array.


• Matrices and arrays are actually vectors, too. They merely have an extra dim attribute, which is what gives them the class "matrix" or "array".


• Creating a matrix
– Using dim
– Using matrix
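
The two approaches can be sketched as follows. The names m1 and m2 and the data 1, ..., 6 are chosen only for illustration:

```r
# Using dim: attach a dimension vector to an existing vector
m1 <- 1:6
dim(m1) <- c(2, 3)

# Using matrix: build the same 2 x 3 matrix in one call
m2 <- matrix(1:6, nrow = 2, ncol = 3)

print(m1)
all(m1 == m2)   # TRUE: both are filled column by column
```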

• The values in the data vector give the values in the array in the same order as they would occur in FORTRAN, that is “column major order,” with the first subscript moving fastest and the last subscript slowest.

• For example if the dimension vector for an array, say a, is c(3,4,2) then there are 24 entries in a and the data vector holds them in the order a[1,1,1], a[2,1,1], ..., a[2,4,2], a[3,4,2].
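
One way to see this order is to fill the array with the numbers 1 to 24, so that every entry equals its own position in the data vector. The choice of data is an assumption made for this sketch:

```r
# a has dimension vector c(3, 4, 2), so it holds 3 * 4 * 2 = 24 entries
a <- array(1:24, dim = c(3, 4, 2))

# Each entry equals its position in the data vector,
# which makes the storage order visible
a[1, 1, 1]   # 1  (first entry)
a[2, 1, 1]   # 2  (first subscript moves fastest)
a[1, 2, 1]   # 4  (second subscript moves next)
a[1, 1, 2]   # 13 (last subscript moves slowest)
a[3, 4, 2]   # 24 (last entry)
```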

X[2,3,2] = ?

X[18] = ?
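
The array X shown on the slide is not reproduced here. As a stand-in, the sketch below assumes a 3 by 4 by 2 array with made-up values and shows how a set of subscripts and a single position in the data vector correspond to each other. The helper pos() is hypothetical and written only for this illustration:

```r
# Hypothetical stand-in for X: values 10, 20, ..., 240 in a 3 x 4 x 2 array
X <- array(seq(10, 240, by = 10), dim = c(3, 4, 2))
d <- dim(X)

# Position of X[i, j, k] in the data vector under column-major order
pos <- function(i, j, k) i + (j - 1) * d[1] + (k - 1) * d[1] * d[2]

pos(2, 3, 2)                    # 20
X[2, 3, 2] == X[pos(2, 3, 2)]   # TRUE

# The reverse direction: which subscripts does position 18 correspond to?
arrayInd(18, d)                 # row 3, column 2, layer 2
X[18] == X[3, 2, 2]             # TRUE
```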

Array indexing. Subsections of an array

• Individual elements of an array may be referenced by giving the name of the array followed by the subscripts in square brackets, separated by commas.

• More generally, subsections of an array may be specified by giving a sequence of index vectors in place of subscripts; however if any index position is given an empty index vector, then the full range of that subscript is taken.
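
A short sketch of both forms of indexing, using an array filled with 1, ..., 24 purely for illustration:

```r
a <- array(1:24, dim = c(3, 4, 2))

# A single element: one subscript per dimension
a[2, 3, 1]              # 8

# A subsection: index vectors in place of subscripts
sub <- a[1:2, c(1, 4), 2]   # rows 1-2, columns 1 and 4, second layer
sub

# Empty index positions take the full range of that subscript
a[2, , ]                # everything in row 2: a 4 x 2 matrix
a[, , 1]                # the whole first layer: a 3 x 4 matrix
```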

Index matrices

• A matrix may be used with a single index matrix in order either to assign a vector of quantities to an irregular collection of elements in the array, or to extract an irregular collection as a vector.

• Example
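
The figure for this example is not reproduced here. The sketch below is adapted from the index-matrix example in Venables et al., An Introduction to R; the variable picked is added only so the extracted values can be inspected:

```r
x <- array(1:20, dim = c(4, 5))           # a 4 x 5 array
i <- array(c(1:3, 3:1), dim = c(3, 2))    # a 3 x 2 index matrix
i                                         # each row is one (row, column) pair

picked <- x[i]   # extracts x[1,3], x[2,2], x[3,1]
picked           # 9 6 3

x[i] <- 0        # assigns to the same irregular collection of elements
x
```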

An index matrix

List

• An R list is an object consisting of an ordered collection of objects known as its components.

• There is no particular need for the components to be of the same mode or type, and, for example, a list could consist of a numeric vector, a logical value, a matrix, a complex vector, a character array, a function, and so on.

• If Lst is a list, then the function length(Lst) gives the number of (top level) components it has.

• New lists may be formed from existing objects by the function list().

> Lst <- list(name_1=object_1, ..., name_m=object_m)

• Example of a list object
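
The original figure is not available here. As a stand-in, the sketch below builds a small list (the component names and values follow the familiar example in An Introduction to R) and shows the difference between single and double brackets:

```r
# A list whose components have different modes and lengths
Lst <- list(name = "Fred", wife = "Mary",
            no.children = 3, child.ages = c(4, 7, 9))

length(Lst)          # 4 top-level components

# Components can be extracted by position or by name
Lst[[4]]             # the numeric vector 4 7 9
Lst$child.ages[2]    # 7

# Single brackets return a sublist; double brackets return the component
class(Lst[1])        # "list"
class(Lst[[1]])      # "character"
```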

They are different.

x = read.csv("20110823_Flow_Series.csv", sep=",")
n.event = 3
event.comb = c()
for (i in 1:n.event) event.comb = c(event.comb, x[[i]][!is.na(x[[i]])]) # combining all events into one series.

• Although the internal storage of a matrix is in column-major order, you can set the “byrow” argument in matrix() to TRUE to indicate that the data is coming in row-major order.

• An EXCEL csv file (number.csv) is as follows:
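
The contents of number.csv are not reproduced here. As a stand-in, the sketch below writes a small hypothetical file (two rows of three numbers, no header) to a temporary location and shows why byrow matters when data arrive row by row:

```r
# Hypothetical stand-in for number.csv: 2 rows of 3 numbers, no header
csv_file <- tempfile(fileext = ".csv")
writeLines(c("1,2,3", "4,5,6"), csv_file)

# scan() reads the file row by row into one long vector
v <- scan(csv_file, sep = ",", quiet = TRUE)
v                                   # 1 2 3 4 5 6

# The default column-major fill does not reproduce the file layout
m_col <- matrix(v, nrow = 2)        # rows: (1 3 5) and (2 4 6)

# byrow = TRUE reproduces the rows of the file
m_row <- matrix(v, nrow = 2, byrow = TRUE)
m_row                               # rows: (1 2 3) and (4 5 6)
```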

Comparing scan and read.csv

Note that mode of z is “list”.
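
The comparison can be sketched as follows, assuming z holds the result of read.csv. The file contents and the name y are hypothetical, and the file is written to a temporary location so the example is self-contained:

```r
# Hypothetical stand-in file: a header line followed by two rows of numbers
csv_file <- tempfile(fileext = ".csv")
writeLines(c("a,b,c", "1,2,3", "4,5,6"), csv_file)

# scan() returns a plain vector (the header line must be skipped)
y <- scan(csv_file, sep = ",", skip = 1, quiet = TRUE)
mode(y)        # "numeric"
is.vector(y)   # TRUE

# read.csv() returns a data frame, which is stored as a list of columns
z <- read.csv(csv_file)
class(z)       # "data.frame"
mode(z)        # "list"
z[[2]]         # the second column as a vector: 2 5
```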

Control statements

• Conditional execution: if statements
– > if (expr_1) expr_2 else expr_3

• Repetitive execution: for loops, repeat and while
– for loop construction
• > for (name in expr_1) expr_2
– Other looping facilities
• > repeat expr
• > while (condition) expr

• The break statement can be used to terminate any loop, possibly abnormally. This is the only way to terminate repeat loops.

• The next statement can be used to discontinue one particular cycle and skip to the “next”.

• Note that the if statement is NOT a loop operation, but the while statement yields a loop operation.
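
A minimal sketch of break and next inside a for loop; the vector odd and the thresholds are chosen only for illustration:

```r
# next: skip even numbers; break: stop as soon as i exceeds 7
odd <- c()
for (i in 1:10) {
  if (i %% 2 == 0) next   # discontinue this cycle, go to the next i
  if (i > 7) break        # terminate the loop entirely
  odd <- c(odd, i)
}
odd   # 1 3 5 7
```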

if

• If statements operate on length-one logical vectors.
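
Because if expects a single TRUE or FALSE, a longer logical vector has to be reduced first, for instance with any() or all(). Recent versions of R signal an error when the condition has length greater than one. The values of Q and the variable names below are made up for this sketch:

```r
Q <- c(3, 10, 8)   # hypothetical values

# if() needs a single TRUE or FALSE, so reduce the vector first
if (any(Q > 6)) status <- "at least one value exceeds 6" else status <- "none exceeds 6"
status

# all() works the same way
if (all(Q > 6)) all_high <- TRUE else all_high <- FALSE
all_high   # FALSE
```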

ifelse

• The ifelse function operates on vectors of any length, element by element.
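
A short sketch of the vectorized form, with hypothetical values of Q:

```r
Q <- c(3, 10, 8, 6)   # hypothetical values

# One result per element of Q: 1 where Q > 6, otherwise 0
qi <- ifelse(Q > 6, 1, 0)
qi   # 0 1 1 0
```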

Loops

• The most commonly used loop structures in R are for, while and apply loops. Less common are repeat loops.

• The break statement is used to break out of loops, and next halts the processing of the current iteration and advances the looping index.

for

• For loops are controlled by a looping vector. In every iteration of the loop one value in the looping vector is assigned to a variable that can be used in the statements of the body of the loop. Usually, the number of loop iterations is defined by the number of values stored in the looping vector and they are processed in the same order as they are stored in the looping vector.
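
A minimal sketch of a looping vector in use; the vector flows and its values are hypothetical:

```r
# The looping vector may hold any values, not only 1, 2, 3, ...
flows <- c(12.5, 3.1, 8.4)   # hypothetical values
total <- 0
for (f in flows) {
  total <- total + f         # f takes each value of flows in turn
}
total                        # 24

# Looping over positions with seq_along() gives access to the index as well
squares <- numeric(length(flows))
for (i in seq_along(flows)) squares[i] <- flows[i]^2
squares
```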

http://manuals.bioinformatics.ucr.edu/home/programming-in-r#TOC-While-Loop

while

• Similar to for loop, but the iterations are controlled by a conditional statement.

• Syntax
– while(condition) statements

Try the following commands:
> qi=0;Q=10
> if(Q>6) qi=1 else qi=0

> qi=0;Q=10
> while(Q>6) qi=1

> qi=0;Q=10
> while(Q>6) {qi=1;break}

For while loops, a break or a change in the state of the condition is needed to exit the loop.
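
The second kind of exit, a change in the state of the condition, can be sketched like this; the counter n is added only for illustration:

```r
# Exit by a change in the state of the condition: Q decreases on each pass
Q <- 10
n <- 0
while (Q > 6) {
  Q <- Q - 1
  n <- n + 1
}
c(Q = Q, n = n)   # Q = 6 after 4 passes
```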

repeat

• Syntax
– repeat statements

• Loop is repeated until a break is specified. This means there needs to be a second statement to test whether or not to break from the loop.
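
A minimal sketch of a repeat loop with its break test; the stopping rule is an arbitrary choice for this example:

```r
i <- 0
repeat {
  i <- i + 1
  if (i^2 > 50) break   # the test that decides when to leave the loop
}
i   # 8, the smallest positive integer whose square exceeds 50
```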

apply Loop Family

• For Two-Dimensional Data Sets: apply
• For Ragged Arrays: tapply
• For Vectors and Lists: lapply and sapply

apply
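
The slide's own example is not reproduced here. As a minimal sketch with a hypothetical 2 by 3 matrix, apply takes the data, a margin (1 for rows, 2 for columns) and the function to apply:

```r
m <- matrix(1:6, nrow = 2)   # hypothetical 2 x 3 matrix

row_sums  <- apply(m, 1, sum)    # MARGIN = 1: one result per row
col_means <- apply(m, 2, mean)   # MARGIN = 2: one result per column

row_sums    # 9 12
col_means   # 1.5 3.5 5.5
```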

More examples of apply
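
The original examples are not available here. The following sketch, with made-up data, shows two further uses: an anonymous function, and a function that returns more than one value per row:

```r
m <- matrix(c(2, 9, 4, 7, 5, 1), nrow = 2)   # hypothetical data

# An anonymous function: width of the range of each column
widths <- apply(m, 2, function(col) max(col) - min(col))
widths   # 7 3 4

# range() returns two values, so the result has one column per row of m
r <- apply(m, 1, range)
r
```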

tapply

• Applies a function to array categories of variable lengths (ragged array). Grouping is defined by a factor.
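
A minimal sketch of a ragged array: the flow values and event labels below are hypothetical, and the three groups deliberately have different lengths:

```r
# Hypothetical data: flows observed during three events of unequal length
flow  <- c(5, 7, 6, 20, 22, 3, 4, 5, 4)
event <- factor(c("A", "A", "A", "B", "B", "C", "C", "C", "C"))

means  <- tapply(flow, event, mean)     # A = 6, B = 21, C = 4
counts <- tapply(flow, event, length)   # A = 3, B = 2,  C = 4

means
counts
```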

lapply and sapply

• Both apply a function to vector or list objects. The function lapply returns a list, while sapply attempts to return the simplest data object, such as a vector or matrix, instead of a list.
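
The difference in return types can be sketched with a small hypothetical list:

```r
x <- list(a = 1:5, b = c(2, 4), c = 10)   # hypothetical list

l <- lapply(x, mean)   # a list with components a, b, c
s <- sapply(x, mean)   # simplified to a named numeric vector

class(l)   # "list"
s          # a = 3, b = 3, c = 10

# When each result has length 2, sapply simplifies to a matrix
rng <- sapply(x, range)
rng
```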

Rounding errors
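
The slide's own example is not reproduced here. A common illustration of rounding errors in floating-point arithmetic, offered only as a sketch, is that decimal fractions such as 0.1 have no exact binary representation, so exact comparisons can fail while comparisons with a tolerance succeed:

```r
# 0.1, 0.2 and 0.3 have no exact binary representation
exact  <- (0.1 + 0.2 == 0.3)                 # FALSE
approx <- isTRUE(all.equal(0.1 + 0.2, 0.3))  # TRUE: equal up to a small tolerance

# The same effect when accumulating in a loop
s <- 0
for (i in 1:10) s <- s + 0.1
s == 1              # FALSE
abs(s - 1) < 1e-8   # TRUE
```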
