Introduction to R
Urfist

Moritz Müller, Maître de Conférences at FSEG, Université de Strasbourg

05.-06.12.2019



Introduction

Data types and structures

Control structures

Data Handling

Basic Analysis


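Overview

• Introduction: What is R?, First steps, Examples

• Data types and structures: Data types, Vector, Matrix, Lists, Data.frames

• Control structures: Conditional expressions, Loops

• Data Handling: Read/Write, Data cleaning/formatting, Data Merge/Selection

• Basic Analysis: Plotting, Statistics, Beyond this course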


References

Short list

• Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth Edition. Springer, New York. ISBN 0-387-95457-0 (downloadable)

• Ligges, U. (2009) Programmieren mit R. 3., überarb. u. erweiterte Auflage. Springer Verlag, Heidelberg.

• Venables, W. N., Smith, D. M., and the R Core Team (2015) An Introduction to R. Notes on R: A Programming Environment for Data Analysis and Graphics. Version 3.2.0 (2015-04-16). Download: http://cran.r-project.org/doc/manuals/R-intro.pdf

• Dalgaard, P. (2008) Introductory Statistics with R. Second Edition. Springer Science+Business Media LLC, New York. Download: http://www.academia.dk/BiologiskAntropologi/Epidemiologi/PDF/Introductory Statistics with R 2nd ed.pdf.


Why (not) use R?


What is R?

An object-oriented, interpreted language and programming environment for data analysis and graphics.

• huge collection of tools for statistics and data analysis

• a language for expressing statistical models and tools

• graphical facilities

• effective object-oriented programming language that can be extended


Short History of R

Short history

1976 S language developed by Bell Labs at AT&T

1988 S-PLUS, commercial version of S (Statistical Sciences Inc.)

1995 R, open source (GPL) version of S (by Ross Ihaka and Robert Gentleman)

1997 R Development Core Team (in short: R Core Team)

1998 Comprehensive R Archive Network (CRAN) founded

2000 R-1.0.0 first version compatible with S

2001 R News journal appears

2004 R version R-2.0.0 (S4 methods in package methods)

2015 R consortium (industry support)


Short History of R - submitted packages over time¹

[Figure: submitted R packages over time]

¹ Gergely Daroczi, source: https://gist.github.com/daroczig


Why (not) use R?

Interpreted language

• programming on the fly

• flexible handling

• slower than a compiled language such as C or C++ (although many built-in functions are written in C)


Why (not) use R?

Open Source

• No black box (in principle)

• New methods are available early, e.g.

    • MCMC with Stan (https://github.com/stan-dev/rstan/wiki/RStan-Getting-Started),

    • web scraping with RSelenium (https://www.seleniumhq.org/projects/webdriver/),

    • deep learning with Keras (https://www.tensorflow.org/guide/keras),

    • natural language processing with WordNet (https://wordnet.princeton.edu/), and many more.

• Support from the community


Why (not) use R?

Open Source & Programming needed

• No central authority

• No GUI (Graphical User Interface)

    • e.g. as offered by SPSS, SAS

Do you want to travel all-inclusive or as a backpacker?


Today's topics

• Introduction: What is R?, First steps, Examples

• Data types and structures: Data types, Vector, Matrix, Lists, Data.frames

• Control structures: Conditional expressions, Loops

• Data Handling: Read/Write, Data cleaning/formatting, Data Merge/Selection

• Basic Analysis: Plotting, Statistics, Beyond this course


Installation on your own device:

Install R

• go to http://www.r-project.org/

• Download R for Windows / Mac / Linux

• Install ‘Base distribution’ with default settings

Install RStudio

• go to http://www.rstudio.org/

• Download and install with default settings


Start RStudio

R vs. RStudio: R is the language and its interpreter, RStudio is a development environment built around it. Short tour through RStudio.


R - Calculator

Math expressions

> 1 + 2 * 3

[1] 7

> 2 * 5^2 - 10 * 5 # a comment

[1] 0

> 4 * sin(pi / 2)

[1] 4

> 0 / 0 # not defined (Not a Number)

[1] NaN


Arithmetic Operators, Functions, Values²

Operators, Functions, Values    Description

*, /                            multiplication, division
+, -                            addition, subtraction
%%                              modulo

max(), min()                    extrema
abs()                           absolute value
round(), floor(), ceiling()     rounding (to nearest, down, up)
sum(), prod()                   sum, product
log()                           logarithm
sin(), cos()                    sine, cosine

pi                              the value of pi
Inf, -Inf                       infinity
NaN                             not defined (Not a Number)
NA                              Not Available
NULL                            empty set

² See R Development Core Team (2008): R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org.
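A minimal sketch that exercises the operators, functions and special values from the table above. The variable names are purely illustrative and not part of the original slides.

```r
# Illustrative names only; none of them appear in the original slides.
remainder <- 7 %% 3                       # modulo: 7 = 2 * 3 + 1
extrema   <- c(min(3, 9, 1), max(3, 9, 1))
rounded   <- c(round(2.567, 1), floor(2.567), ceiling(2.567))
totals    <- c(sum(1:4), prod(1:4))
one       <- log(exp(1))                  # log() is the natural logarithm by default

# Special values
pos_inf   <- 1 / 0                        # Inf
not_num   <- 0 / 0                        # NaN
not_avail <- NA                           # Not Available
nothing   <- NULL                         # empty object of length 0

print(remainder)
print(rounded)
print(c(is.infinite(pos_inf), is.nan(not_num), is.na(not_avail), is.null(nothing)))
```

Note that `is.na()` also returns TRUE for NaN, while `is.nan()` is TRUE only for NaN.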


Assignments

How to ‘remember’ calculations

> x1 <- 3.25 # assign the value 3.25 to object x1

> x1

[1] 3.25

# please don’t use ‘=’ or ‘->’ for assignments

# <<- ‘special’ assignment - kept for later

Declaration and initialisation at once.


Print function - print()

What appears on screen? Try

> x1 <- 3.25

> (x1 <- 3.25)

> print(x1 <- 3.25)

> x1
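One possible reading of the exercise above, as a sketch: an assignment returns its value invisibly, so nothing appears on screen unless the value is made visible with parentheses, `print()`, or by typing the object name. The helper objects `plain` and `shown` are assumptions added here to make the visibility flag inspectable.

```r
x1 <- 3.25             # assignment only: nothing is printed
(x1 <- 3.25)           # parentheses make the assigned value visible
print(x1 <- 3.25)      # print() shows the value explicitly
x1                     # typing the name auto-prints the object

# withVisible() reports whether an expression would be auto-printed
plain <- withVisible(x1 <- 3.25)
shown <- withVisible((x1 <- 3.25))
c(plain = plain$visible, shown = shown$visible)
```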


Objects

R is an object-oriented language

• Everything in R is an object (data, functions, etc.),

• objects have names (starting with a letter; names are case-sensitive),

• every object has a length(),

• every object is of a certain class - class(),

• for each class, there are special (generic) functions (e.g. print()),

• every object has a mode (data type) - mode(),

• an object may have attributes - attributes(),

• classes may inherit from other classes.
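The properties listed above can be inspected directly. The following is a small sketch; the objects `x` and `size` are hypothetical examples, not taken from the slides.

```r
x <- c(height = 10, width = 4)   # hypothetical example object
len_x   <- length(x)             # number of elements
class_x <- class(x)              # "numeric"
mode_x  <- mode(x)               # "numeric"
attr_x  <- attributes(x)         # list holding the element names

# Functions are objects too
class_fun <- class(mean)         # "function"
len_fun   <- length(mean)        # 1

# Inheritance: an ordered factor is also a factor
size <- ordered(c("small", "large"), levels = c("small", "large"))
class_size <- class(size)        # "ordered" "factor"
is_factor  <- inherits(size, "factor")

print(attr_x)
print(class_size)
```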


Find Help

Integrated help

• help.start()

• help("cor"), ?cor

• help.search("correlation"), ??correlation


Find Help

Internet

• check CRAN (the Comprehensive R Archive Network)

• CRAN Task Views

• CRAN discussions / mailing lists / examples

• manuals / vignettes

• R News / The R Journal

• search for snippets - anything


Find Help

Books

• see the literature on the References slide, and many more.


Faithful - example
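The slide only names the example. As a sketch, assuming it refers to the `faithful` data set that ships with R's datasets package (eruption durations and waiting times of the Old Faithful geyser), a first look at the data might be:

```r
# Assumption: the example uses the built-in data set `faithful`.
str(faithful)                    # structure: a data.frame with two numeric columns
summary(faithful)                # basic descriptive statistics per column

n_obs <- nrow(faithful)          # number of observed eruptions
r <- cor(faithful$eruptions, faithful$waiting)
print(c(n_obs = n_obs, correlation = round(r, 2)))

# plot(faithful) would draw waiting time against eruption duration
```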


Workspace and working directory

Workspace - save what?

• workspace keeps your R-objects

• workspace is located in your RAM

• different environments in your workspace


Workspace and working directory

Workspace - try out

> ls()

?

> search()

?

> ls(name="package:datasets")

?
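A sketch of what the three calls above report. The object `demo_obj` is hypothetical and created only so that `ls()` has something to list; the exact output depends on the session.

```r
demo_obj <- 1:3                       # hypothetical object, created only for this sketch
ls()                                  # names of the objects in the current workspace
search()                              # search path: ".GlobalEnv" first, then attached packages
head(ls(name = "package:datasets"))   # objects provided by the attached datasets package

has_faithful <- "faithful" %in% ls(name = "package:datasets")
rm(demo_obj)                          # remove the object from the workspace again
```

The search path explains the "different environments" of the previous slide: R looks up a name first in the workspace and then in each attached package in turn.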


Workspace and working directory

Working directory - default folder for save/load etc. - try out

> getwd()

[1] "/Users/moritz"

> setwd("/Users/moritz/Documents/Teaching/2019_2020/Urfist")


Data types - mode()

Description               Example       Data type

Basic data types (hierarchical)

empty set                 NULL          NULL
boolean                   TRUE          logical
integers & reals          3.14          numeric
complex numbers           2.13+1i       complex
characters and strings    "Hello"       character

Composite data type

factors                   blue & red    factor


Logical values in R

Try and explain

> 4 < 3

> (3+1) !=3

> -3<-2

> -3 < -2

> (3 >= 2) & (4 == (3+1))

> c(TRUE, FALSE) & c(TRUE, TRUE)

> x <- c(-4, -5, -1, 0, 2, 4, 5)

> x > 0

> sum(x > 0)
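One way to work through the exercise above, as a sketch. The line `-3<-2` is wrapped in `tryCatch()` here because R reads it as an assignment to `-3` and stops with an error; the names `attempt`, `positive` and `n_positive` are illustrative.

```r
4 < 3                            # FALSE
(3 + 1) != 3                     # TRUE

# Without spaces, -3<-2 is read as the assignment -3 <- 2, which fails
attempt <- tryCatch(eval(parse(text = "-3<-2")), error = function(e) e)
inherits(attempt, "error")       # TRUE
-3 < -2                          # TRUE: with spaces it is a comparison

(3 >= 2) & (4 == (3 + 1))        # TRUE
c(TRUE, FALSE) & c(TRUE, TRUE)   # TRUE FALSE: & works element by element

x <- c(-4, -5, -1, 0, 2, 4, 5)
positive   <- x > 0              # one logical value per element
n_positive <- sum(positive)      # TRUE counts as 1, FALSE as 0
print(n_positive)
```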


Logical values in R

Function, Operator, Value    Description

==, !=                       equal, not equal
>, >=                        greater, greater or equal
<, <=                        smaller, smaller or equal
!                            not

&, &&                        and (vectorised, single value)
|, ||                        or (vectorised, single value)
xor()                        exclusive or

TRUE, FALSE                  true, false


Logical values in R

Try - any(), all(), which()

> x <- c(2, 0, 3, 1, 6, 9, 7)

> someTrue <- x > 4

> allTrue <- x >= 0

> any(someTrue)

> all(allTrue)

> any(allTrue)

> all(someTrue)

> !any(allTrue)

> all(!someTrue)

> all(!allTrue)

> which(someTrue)

> which(!someTrue)
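A sketch of selected results for the exercise above, with the values worked out in the comments. The names `pos_true` and `pos_false` are added here for illustration.

```r
x <- c(2, 0, 3, 1, 6, 9, 7)
someTrue <- x > 4      # FALSE FALSE FALSE FALSE TRUE TRUE TRUE
allTrue  <- x >= 0     # TRUE for every element

any(someTrue)          # TRUE: at least one element is TRUE
all(someTrue)          # FALSE: not every element is TRUE
all(allTrue)           # TRUE
!any(allTrue)          # FALSE
all(!allTrue)          # FALSE

pos_true  <- which(someTrue)    # positions of TRUE: 5 6 7
pos_false <- which(!someTrue)   # positions of FALSE: 1 2 3 4
print(pos_true)
```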


Logical values in R

Not available - NA - is special

> # convert NA

> x <- NA

> mode(x)

> y <- c(3, x)

> mode(y)

> # comparison and calculation

> # - (if a part is unknown - result will be unknown)

> y > 2

> sum(y)

> # do not consider (remove) NA elements

> sum(y, na.rm=TRUE)
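The expected behaviour of the NA examples above, sketched with the results in comments. The last line is an addition: `is.na()` is the usual way to test for missing values.

```r
x <- NA
mode_x <- mode(x)               # "logical": NA on its own is a logical value
y <- c(3, x)
mode_y <- mode(y)               # "numeric": NA takes on the type of the vector

cmp   <- y > 2                  # TRUE NA: comparing with NA gives NA
total <- sum(y)                 # NA: one unknown part makes the sum unknown
clean <- sum(y, na.rm = TRUE)   # 3: NA elements are dropped first

is.na(y)                        # FALSE TRUE (y == NA would only return NA)
```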


Numeric values in R

Try - mode(), typeof()

> (x <- pi)

[1] 3.141593

> mode(x)

[1] "numeric"

> typeof(x)

[1] "double"

> (y <- as.integer(x)) # information loss

[1] 3

> typeof(y)

[1] "integer"


Numeric values in R

Try - conversion

> is.character(y)

[1] FALSE

> x <- -1

> -1^0.5

[1] -1

> sqrt(as.complex(x))

[1] 0+1i

> (z <- as.character(y))

[1] "3"

> as.numeric("z") # the string "z", not the object z

[1] NA

Warning message:

NAs introduced by coercion

> (y <- as.numeric("-1"))

[1] -1

> y*5

[1] -5
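A short sketch of the operator-precedence point in the example above: `-1^0.5` is evaluated as `-(1^0.5)`, so it does not take the root of a negative number. The variable names are illustrative.

```r
x <- -1
a <- -1^0.5                    # -1: ^ binds more tightly than the unary minus
b <- x^0.5                     # NaN: this is (-1)^0.5, not defined for real numbers
c_root <- sqrt(as.complex(x))  # 0+1i: defined once x is treated as complex

# Round trip between types
y    <- as.integer(pi)         # 3: the fractional part is lost
z    <- as.character(y)        # "3"
back <- as.numeric(z)          # 3: the object z converts cleanly
print(c(a = a, b = b, back = back))
```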


Character values in R

There are just strings. Manipulating them is another story, left for the next course.

"hello world"

a <- "hello world"


Factors in R

(ordered) factors

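# levels are inferred from the data
gulliver <- factor(c("dwarf", "giant", "giant", "giant", "dwarf"))

# explicit levels, including one that does not occur in the data
gulliver <- factor(c("dwarf", "giant", "giant", "giant", "dwarf"),
                   levels = c("giant", "dwarf", "gulliver"))

# ordered factor: dwarf < gulliver < giant
gulliver <- ordered(c("dwarf", "giant", "giant", "giant", "dwarf"),
                    levels = c("dwarf", "gulliver", "giant"))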


Factors in R

(ordered) factors

> class(gulliver)

[1] "ordered" "factor"

> str(gulliver)

Ord.factor w/ 3 levels "dwarf"<"gulliver"<..: 1 3 3 3 1

> mode(gulliver)

[1] "numeric"

> length(gulliver)

[1] 5
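The output above shows that a factor stores integer codes plus a set of levels. A small sketch of how this could be inspected; `codes`, `counts` and `bigger` are illustrative names.

```r
gulliver <- ordered(c("dwarf", "giant", "giant", "giant", "dwarf"),
                    levels = c("dwarf", "gulliver", "giant"))

levels(gulliver)                # "dwarf" "gulliver" "giant"
codes  <- as.integer(gulliver)  # 1 3 3 3 1: position of each value in the levels
counts <- table(gulliver)       # unused levels are counted too (gulliver: 0)
bigger <- gulliver > "dwarf"    # comparisons are possible because the factor is ordered

print(counts)
print(bigger)
```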


Data structures

A data structure is the representation of data, which defines how we operate on the data.

Data structures in R

• vector

• matrix

• array

• list

• data.frame

For each data structure there is a function to declare & initialise an object, e.g. vector().
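As a sketch, each of the structures listed above can be created with its constructor function. The sizes and the column names of the data.frame are arbitrary choices made for this illustration.

```r
v  <- vector("numeric", length = 3)      # numeric vector: 0 0 0
m  <- matrix(0, nrow = 2, ncol = 3)      # 2 x 3 matrix filled with 0
a  <- array(0, dim = c(2, 3, 2))         # array with three dimensions
l  <- vector("list", length = 2)         # list with two empty (NULL) elements
df <- data.frame(id = 1:2,               # hypothetical columns
                 name = c("a", "b"))

str(v)
dim(m)
dim(a)
length(l)
str(df)
```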


Vector

characteristics of vectors

> x <- c(4.1, 4.3, 2, 54, 34) # c() for concatenate

>

> class(x)

[1] "numeric"

> mode(x)

[1] "numeric"

> typeof(x)

[1] "double"

> str(x)

num [1:5] 4.1 4.3 2 54 34

> is.vector(x)

[1] TRUE

> length(x)

[1] 5

> attributes(x)

NULL


Vector

characteristics of vectors

> # Only of one data type

> c(x, FALSE, TRUE)

[1] 4.1 4.3 2.0 54.0 34.0 0.0 1.0

> c(x, FALSE, TRUE, "Hello")

[1] "4.1" "4.3" "2" "54" "34" "FALSE" "TRUE" "Hello"
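The two results above follow from the hierarchy of basic data types: when types are mixed, `c()` converts everything to the most general one. A sketch that makes the conversion visible with `typeof()`; the variable names are illustrative.

```r
x <- c(4.1, 4.3, 2, 54, 34)

t_int <- typeof(c(TRUE, 2L))             # "integer": logical is converted to integer
t_num <- typeof(c(x, FALSE, TRUE))       # "double": logical becomes 0 / 1
t_chr <- typeof(c(x, FALSE, "Hello"))    # "character": everything becomes a string

# Going back down the hierarchy only works where the content allows it
back <- suppressWarnings(as.numeric(c("4.1", "Hello")))   # 4.1 NA
print(c(t_int, t_num, t_chr))
```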


Vector

characteristics of vectors

> x <- c(height=10, width=4)

> x

height width

10 4

> attributes(x)

$names

[1] "height" "width"

> names(x)

[1] "height" "width"


Vector

sequences

> 2:5

[1] 2 3 4 5

> seq(2,5,by=2)

[1] 2 4

> rep(2:4,times=3)

[1] 2 3 4 2 3 4 2 3 4
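A few further variations of the sequence functions above, as a sketch. The arguments `length.out` and `each` are standard arguments of `seq()` and `rep()`; the chosen values are arbitrary.

```r
s1 <- seq(2, 5, by = 2)            # 2 4: the next value, 6, would exceed 5
s2 <- seq(0, 1, length.out = 5)    # 0.00 0.25 0.50 0.75 1.00
s3 <- rep(2:4, times = 3)          # whole vector repeated: 2 3 4 2 3 4 2 3 4
s4 <- rep(2:4, each = 2)           # each element repeated: 2 2 3 3 4 4
s5 <- 5:2                          # descending sequence: 5 4 3 2

print(s2)
print(s4)
```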


Vector

Calculate with vectors

> x <- c(2, 4, 8)

> x*2

[1] 4 8 16

> x-3

[1] -1 1 5

>

> x * c(3,2,1)

[1] 6 8 8

> x * c(3,2)

[1] 6 8 24

Warning message:

In x * c(3, 2) :

longer object length is not a multiple of shorter object length
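The warning above comes from recycling: the shorter vector is repeated until it matches the longer one. A sketch that spells out the recycled vector explicitly; `short` and `recycled` are illustrative names.

```r
x     <- c(2, 4, 8)
short <- c(3, 2)

# What R does implicitly for x * short: repeat the shorter vector to length 3
recycled <- rep(short, length.out = length(x))   # 3 2 3
explicit <- x * recycled                         # 6 8 24, no warning

# Same values, but R warns because 3 is not a multiple of 2
implicit <- suppressWarnings(x * short)
print(explicit)
```

A scalar such as the 2 in `x * 2` is recycled in the same way, which is why that case needs no special rule.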


Vector

Calculate with vectors

> t(2:4) %*% 1:3

[,1]

[1,] 20

> 2:4 %*% 1:3

[,1]

[1,] 20

> 2:4 %*% t(1:3)

[,1] [,2] [,3]

[1,] 2 4 6

[2,] 3 6 9

[3,] 4 8 12
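The three results above are the inner product (twice) and the outer product of the two vectors. A sketch with equivalent formulations; the names `a`, `b` and the result variables are illustrative.

```r
a <- 2:4
b <- 1:3

inner  <- sum(a * b)       # 2*1 + 3*2 + 4*3 = 20
inner2 <- drop(a %*% b)    # same value; drop() turns the 1 x 1 matrix into a number

outer_ab <- a %*% t(b)     # 3 x 3 matrix, element [i, j] is a[i] * b[j]
outer2   <- outer(a, b)    # the same matrix via outer()

print(inner)
print(outer_ab)
```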


Vector

Indexing vectors

> x

[1] 2 4 8

> x[2] # 2nd element by position

[1] 4

> x[c(FALSE,TRUE,FALSE)] # 2nd element with logical value

[1] 4

> names(x) <- c("Height","Length","Width")

> x["Length"] # access by name

Length

4

> x[-2] # without 2nd element

[1] 2 8


Vector

Indexing vectors

# vector x

x <- c(2, 4, 8)

# extract by boolean vector

idx <- x < 5

print(idx)

x[idx]

# overwrite

x[idx] <- 2

# extract by position vector

pos <- which(x < 5)

print(pos)

x[pos]

# overwrite

x[pos] <- 1
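The intermediate values of the example above, traced step by step as a sketch. The second part, with the vector `z`, is an addition that shows where logical and positional indexing differ.

```r
x <- c(2, 4, 8)
idx <- x < 5             # TRUE TRUE FALSE
x[idx]                   # 2 4
x[idx] <- 2              # x is now 2 2 8
pos <- which(x < 5)      # 1 2
x[pos] <- 1              # x is now 1 1 8
print(x)

# With missing values the two approaches differ (illustrative vector z)
z <- c(2, NA, 8)
by_logical  <- z[z < 5]          # 2 NA: the unknown comparison yields NA
by_position <- z[which(z < 5)]   # 2: which() keeps only positions that are TRUE
```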


Vector

Indexing vectors

> x[3,2,1] # reverse order?

Error in x[3, 2, 1] : incorrect number of dimensions

> x[c(3,2,1)] # reverse order!

[1] 8 4 2

> x[] <- -2 # replace every element by -2

> x

[1] -2 -2 -2

> x <- -2 # overwrite x by -2

> x

[1] -2


Matrix

initialise a Matrix - try and explain

x <- 1:3

y <- 4:6

cbind(x,y)

rbind(x,y)

matrix(1:6, nrow=3, ncol=2)

matrix(1:6, nrow=3, byrow=TRUE)

matrix(1:6, ncol=2, byrow=TRUE)

matrix(1:6, nrow=3, ncol=4, byrow=TRUE)
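A sketch of what the calls above produce, with the key differences noted in comments. The result names are illustrative.

```r
x <- 1:3
y <- 4:6
by_col <- cbind(x, y)    # 3 x 2: x and y become columns
by_row <- rbind(x, y)    # 2 x 3: x and y become rows

m_col <- matrix(1:6, nrow = 3, ncol = 2)                # filled column by column (default)
m_row <- matrix(1:6, nrow = 3, byrow = TRUE)            # filled row by row
m_rec <- matrix(1:6, nrow = 3, ncol = 4, byrow = TRUE)  # 12 cells: 1:6 is recycled twice

print(m_col)
print(m_row)
print(m_rec)
```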


Matrix

Operate on matrices - try and explain

A <- matrix(c(1,1,2,2), nrow=2, byrow=TRUE)

B <- t(A) # transpose

solve(A) # inverse

# elementwise operations

A * B

A / B

A - B

A^B

A == B

A == t(B)

any(A==B)

# matrix multiplication

A %*% B
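Part of the "explain" in the exercise above: the matrix A as defined has linearly dependent rows, so it is singular and `solve(A)` stops with an error. The sketch below catches that error and, as an assumption added for contrast, inverts a hypothetical invertible matrix `M`.

```r
A <- matrix(c(1, 1, 2, 2), nrow = 2, byrow = TRUE)
B <- t(A)

elementwise <- A * B      # rows: (1, 2) and (2, 4)
product     <- A %*% B    # rows: (2, 4) and (4, 8)

# A is singular (second row = 2 * first row), so no inverse exists
inv_attempt <- tryCatch(solve(A), error = function(e) e)
inherits(inv_attempt, "error")    # TRUE

# Hypothetical invertible matrix for comparison
M     <- matrix(c(2, 0, 0, 4), nrow = 2)
M_inv <- solve(M)
check <- M %*% M_inv              # identity matrix
print(product)
print(check)
```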


Matrix

Access matrix elements - exercise

> X <- matrix(1:6, nrow=3)

> rownames(X) <- c("length", "width", "height")

> colnames(X) <- c("shelf","chair")

• read length of shelf by [row pos., column pos.]

• read length of shelf by [row name, column name]

• read length of shelf by [row name, column pos.]

• overwrite length of shelf with 14

• get all dimensions of shelf

• get the length of all furniture

• overwrite the length of all furniture with its prior length times 100
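One possible solution to the exercise above, as a sketch rather than the only way to do it. The names `shelf_dims` and `all_lengths` are illustrative.

```r
X <- matrix(1:6, nrow = 3)
rownames(X) <- c("length", "width", "height")
colnames(X) <- c("shelf", "chair")

X[1, 1]                        # by [row pos., column pos.]
X["length", "shelf"]           # by [row name, column name]
X["length", 1]                 # by [row name, column pos.]

X["length", "shelf"] <- 14     # overwrite the length of the shelf
shelf_dims  <- X[, "shelf"]    # all dimensions of the shelf: 14 2 3
all_lengths <- X["length", ]   # length of all furniture: 14 4

X["length", ] <- X["length", ] * 100   # prior lengths times 100
print(X)
```

Leaving an index empty, as in `X[, "shelf"]`, selects the whole row or column.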


Matrix

Matrix features

> class(X)

[1] "matrix" # R >= 4.0.0 prints "matrix" "array"

> typeof(X)

[1] "integer"

> mode(X)

[1] "numeric"


Matrix

Matrix features

> attributes(X)

$dim

[1] 3 2

$dimnames

$dimnames[[1]]

[1] "length" "width" "height"

$dimnames[[2]]

[1] "shelf" "chair"


Matrix

Matrix features

> str(X)

int [1:3, 1:2] 1 2 3 4 5 6

- attr(*, "dimnames")=List of 2

..$ : chr [1:3] "length" "width" "height"

..$ : chr [1:2] "shelf" "chair"

Matrices are special cases of vectors in R (the opposite of mathematics, of course).
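A sketch of what "special case of a vector" means in practice: a matrix is a vector that carries a `dim` attribute. The object `v` is illustrative.

```r
v <- 1:6
before <- is.matrix(v)     # FALSE: a plain vector

dim(v) <- c(3, 2)          # adding a dim attribute turns it into a matrix
after <- is.matrix(v)      # TRUE

fifth <- v[5]              # 5: vector-style indexing still works (column by column)
same  <- identical(v, matrix(1:6, nrow = 3))
print(v)
```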


Array

Array - like matrix but possibly more dimensions

Avoid arrays. Don't use arrays with more than 3 dimensions if not absolutely necessary. Use map of array.


List

List - features

• very flexible,

• may contain different data types,

• may also contain other lists (recursive)


> (L1 <- list("this is a", 100, matrix(1:4,nrow=2),

list(TRUE,FALSE)))

[[1]]

[1] "this is a"

[[2]]

[1] 100

[[3]]

[,1] [,2]

[1,] 1 3

[2,] 2 4

[[4]]

[[4]][[1]]

[1] TRUE

[[4]][[2]]

[1] FALSE


> (L1 <- list(a="this is a", b=100, c=matrix(1:4,nrow=2),

truth=list(TRUE,FALSE)))

$a

[1] "this is a"

$b

[1] 100

$c

[,1] [,2]

[1,] 1 3

[2,] 2 4

$truth

$truth[[1]]

[1] TRUE

$truth[[2]]

[1] FALSE


> L1[[1]] # access by position

[1] "this is a"

> L1[["a"]] # access by name

[1] "this is a"

>

> L1$b # access by name (but different)

[1] 100
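The three access operators behave slightly differently. The sketch below rebuilds the named list L1 from above and shows the differences, in particular that $ matches names partially while [[ ]] matches them exactly by default.

```r
L1 <- list(a = "this is a", b = 100, c = matrix(1:4, nrow = 2),
           truth = list(TRUE, FALSE))

L1["b"]          # single brackets: a list of length 1 (a sub-list)
L1[["b"]]        # double brackets: the element itself (numeric 100)
L1$tr            # $ matches names partially: returns L1$truth
L1[["tr"]]       # [[ ]] matches exactly by default: NULL

# [[ ]] accepts a computed name, $ does not
key <- "c"
L1[[key]]        # the 2 x 2 matrix
L1$truth[[2]]    # nested access: FALSE
```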


> L1[[1]] <- NULL

> L1

$b

[1] 100

$c

[,1] [,2]

[1,] 1 3

[2,] 2 4

$truth

$truth[[1]]

[1] TRUE

$truth[[2]]

[1] FALSE


data.frames

• the most typical structure for data sets

• data.frames are lists whose entries are vectors of the same length

• created with data.frame()


Data.frame

initialise Data.frame

> shopping <- data.frame(Product=c("cheese","wine","bread"),

Unit=c("grams","bottles","loaf"), Amount=c(300,2,2))

>

> str(shopping)

'data.frame': 3 obs. of 3 variables:

$ Product: Factor w/ 3 levels "bread","cheese",..: 2 3 1

$ Unit : Factor w/ 3 levels "bottles","grams",..: 2 1 3

$ Amount : num 300 2 2
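The factor columns in this output come from the old default stringsAsFactors = TRUE; since R 4.0.0 the default is FALSE and the same call yields character columns. A short sketch that sets the argument explicitly, and also checks that a data.frame really is a list (shopping_chr is a name introduced here for illustration):

```r
# stringsAsFactors is set explicitly because its default changed
# from TRUE to FALSE in R 4.0.0
shopping <- data.frame(Product = c("cheese", "wine", "bread"),
                       Unit = c("grams", "bottles", "loaf"),
                       Amount = c(300, 2, 2),
                       stringsAsFactors = TRUE)
class(shopping$Product)       # "factor", as in the str() output above

shopping_chr <- data.frame(Product = c("cheese", "wine", "bread"),
                           Amount = c(300, 2, 2),
                           stringsAsFactors = FALSE)
class(shopping_chr$Product)   # "character"

# A data.frame is a list of equally long column vectors
is.list(shopping)             # TRUE
length(shopping)              # 3 (number of columns)
sapply(shopping, length)      # every column has 3 entries
```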


Data.frame - access elements

> shopping[2,]

Product Unit Amount

2 wine bottles 2

> shopping[2,3]

[1] 2

> shopping[2,"Amount"]

[1] 2

> shopping$Amount[2]

[1] 2


Plot two regimes of Old Faithful

[Figure: scatter plot of the Old Faithful data; x-axis: waiting (40 to 100), y-axis: eruptions (1 to 6).]


Plot two regimes of old faithful (by subsetting data.frame)

# plot geyser data

plot(NA, xlim=c(40,100),ylim=c(1,6), xlab="waiting",

ylab="eruptions")

points(regime1[,"waiting"],regime1[,"eruptions"],col="red")

points(regime2[,"waiting"],regime2[,"eruptions"],col="blue")

points(rest[,"waiting"],rest[,"eruptions"],col="grey")

lines(x=c(70,70),y=c(0,7),lty=3)

lines(x=c(40,100),y=c(3.5,3.5),lty=3)
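The objects regime1, regime2 and rest are not defined on this slide. One plausible way to obtain them, assuming the two regimes are separated at the dotted lines (waiting = 70, eruptions = 3.5), is to subset the built-in faithful data with logical index vectors:

```r
# Assumed definitions (not shown in the slides): split the built-in
# 'faithful' data at the dotted lines waiting = 70 and eruptions = 3.5
short <- faithful$waiting < 70 & faithful$eruptions < 3.5
long  <- faithful$waiting >= 70 & faithful$eruptions >= 3.5

regime1 <- faithful[short, ]             # short wait, short eruption
regime2 <- faithful[long, ]              # long wait, long eruption
rest    <- faithful[!(short | long), ]   # everything else

# the three subsets together cover every observation exactly once
nrow(regime1) + nrow(regime2) + nrow(rest) == nrow(faithful)
```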


Plot two regimes of Old Faithful (by adding variables to data.frame)

# plot geyser data

plot(faithful$waiting,faithful$eruptions, col=faithful$color,

xlim=c(40,100),ylim=c(1,6),

xlab="waiting", ylab="eruptions")

lines(x=c(70,70),y=c(0,7),lty=3)

lines(x=c(40,100),y=c(3.5,3.5),lty=3)
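The column faithful$color used in the plot call is not created on this slide. A sketch of how it could be added, again assuming the regimes are separated at waiting = 70 and eruptions = 3.5:

```r
# Assumed way to create the 'color' column (not shown in the slides).
# Assigning to faithful$color makes a modified copy in the workspace.
faithful$color <- "grey"
faithful$color[faithful$waiting < 70 & faithful$eruptions < 3.5] <- "red"
faithful$color[faithful$waiting >= 70 & faithful$eruptions >= 3.5] <- "blue"

table(faithful$color)   # number of observations per colour
```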

Control structures

Conditional expressions

If...else

> x <- 1

> if(x > 0){

+ print("x is pos.")

+ }else{

+ print("x is neg.")

+ }

[1] "x is pos."
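Note that the else branch above also catches x == 0. A small sketch with an additional else if branch; describe_sign is a hypothetical helper name. if() expects a single TRUE or FALSE, and at top level the else has to follow the closing brace on the same line.

```r
# Hypothetical helper: classify a single number
describe_sign <- function(x) {
  if (x > 0) {
    "x is pos."
  } else if (x < 0) {
    "x is neg."
  } else {
    "x is zero"
  }
}

describe_sign(1)    # "x is pos."
describe_sign(-2)   # "x is neg."
describe_sign(0)    # "x is zero"
```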


ifelse()

# vectorized, use for simple expressions

> x <- c(1,2,3,4)

> ifelse(x > 2, "A", "B")

[1] "B" "B" "A" "A"


switch

> switch(2, a=11, b=12, cc=13, d=14)

[1] 12

> (switch("c", a=11, b=12, cc=13, d=14))

NULL

> switch("cc", a=11, b=12, cc=13, d=14)

[1] 13
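In the calls above, a number selects an alternative by position and a string selects it by exact name; without a match switch() returns NULL invisibly. A sketch of two further features, a default value and fall-through (lookup and kind are hypothetical helper names):

```r
# The last, unnamed argument is the default when no name matches
lookup <- function(key) {
  switch(key, a = 11, b = 12, cc = 13, d = 14, NA)
}
lookup("cc")    # 13
lookup("c")     # NA instead of NULL: names must match exactly

# Empty alternatives fall through to the next non-empty one
kind <- function(ext) {
  switch(ext, csv = , txt = "text file", rds = "R object", "unknown")
}
kind("csv")     # "text file"
kind("xlsx")    # "unknown"
```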


Loops

for()

> for(i in 1:3){

+ print(i)

+ }

[1] 1

[1] 2

[1] 3


repeat

> i <- 0

> repeat{

+ i <- i+1

+ if(i < 3){

+ next # start next turn

+ }

+ print(i)

+

+ if(i == 3){

+ break # exit loop

+ }

+

+ }

[1] 3


while

> i <- 0

> while(i < 3){

+ print(i)

+ i <- i + 1

+ }

[1] 0

[1] 1

[1] 2


Loops in R

Short note on loops and efficiency

• R is executed by an interpreter (not compiled): the interpreter has to process the body of a loop again on every iteration.

• Vectorized code is typically faster (the call is interpreted once; fetching, calculating and writing each value then run 'compressed' in compiled code).

• But vectorized code may be slower (if the size of the vector becomes too large),

• may be difficult to read (if several objects are handled),

• may not be possible (if each step depends on the result of the previous one).
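The points above can be sketched as follows. The first part computes the same result with a loop and with a vectorized expression; the second part is a made-up example (a balance with 5 % interest and a yearly deposit of 10) where each step needs the previous result, so a loop is the natural form.

```r
x <- 1:10000

# Loop version: one interpreted step per element
squares_loop <- numeric(length(x))   # pre-allocate the result
for (i in seq_along(x)) {
  squares_loop[i] <- x[i]^2
}

# Vectorized version: a single call, the iteration runs in compiled code
squares_vec <- x^2

all.equal(squares_loop, squares_vec)   # TRUE

# Sequential dependence: every value needs the previous one
balance <- numeric(10)
balance[1] <- 100
for (t in 2:10) {
  balance[t] <- balance[t - 1] * 1.05 + 10
}
```

Wrapping each variant in system.time() gives a rough idea of the speed difference on a given machine.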


Exercise

• Exercise ‘R shaked’

• Exercise ‘Normal Distribution uncommented’

Data Handling

Read/Write

Import/export, read/write

• R objects (load(), save())

• text-files (read.table() / write.table(), scan() / cat())

• Excel/Access/databases (e.g. RODBC or RMySQL)

• SAS/SPSS/Stata (e.g. foreign)

See the manual “R Data Import/Export”.
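A minimal round trip for the first two bullets, using temporary files so that nothing is written into the working directory. The data frame and file names are made up for illustration.

```r
df <- data.frame(id = 1:3, name = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

# R objects: save() stores objects under their names, load() restores them
rdata_file <- tempfile(fileext = ".RData")
save(df, file = rdata_file)
rm(df)
load(rdata_file)               # 'df' exists again

# Text files: write.table() / read.table()
txt_file <- tempfile(fileext = ".txt")
write.table(df, file = txt_file, sep = "\t",
            row.names = FALSE, quote = FALSE)
df2 <- read.table(txt_file, header = TRUE, sep = "\t",
                  stringsAsFactors = FALSE)

all.equal(df, df2)             # TRUE
```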


Character string handling

Function               Description

scan() / cat()         read from / print to file (or console)
parse() / deparse()    convert character string into expression (and vice versa)
grep()                 regular expressions
nchar()                number of characters in a string
strsplit() / paste()   split / join strings

See e.g. Sanchez, G. (2013). Handling and Processing Strings in R. Trowchez Editions, Berkeley. http://www.gastonsanchez.com/Handling_and_Processing_Strings_in_R.pdf
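A few one-line illustrations of the functions in the table, applied to a made-up string:

```r
s <- "Introduction to R"

nchar(s)                           # 17 characters
words <- strsplit(s, " ")[[1]]     # "Introduction" "to" "R"
paste(words, collapse = "_")       # "Introduction_to_R"
paste("file", 1:2, sep = "-")      # "file-1" "file-2"

grep("^I", words)                  # 1: index of the matching element
grepl("o", words)                  # TRUE TRUE FALSE

# parse() / deparse(): text <-> expression
e <- parse(text = "1 + 2")[[1]]
eval(e)                            # 3
deparse(e)                         # "1 + 2"

cat("Words:", length(words), "\n") # prints to the console
```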


Data Merge/Selection

How to join and split data.frames?

• merge(),

• rbind()

• cbind()

• subset()

• idx <- x == y; df[idx, c("A","B")]

Note: merging/splitting very large tables (millions of entries) is better done in a database.
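The functions listed above can be sketched on two small, made-up tables (prices and stock are hypothetical names):

```r
# Hypothetical example tables
prices <- data.frame(Product = c("cheese", "wine", "bread"),
                     Price = c(4.5, 7, 1.2))
stock  <- data.frame(Product = c("wine", "bread", "milk"),
                     Amount = c(2, 2, 1))

inner <- merge(prices, stock, by = "Product")              # only matching rows
full  <- merge(prices, stock, by = "Product", all = TRUE)  # all rows, NA where missing

rbind(prices, data.frame(Product = "milk", Price = 0.9))   # append rows
cbind(prices, Organic = c(TRUE, FALSE, TRUE))              # append a column

subset(prices, Price > 2, select = c(Product, Price))      # filter rows

idx <- prices$Price > 2                  # logical index vector
prices[idx, c("Product", "Price")]       # same selection with [ ]
```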

Basic Analysis

R plotting

How to make good graphs

1. Excel plots

2. R plots - various devices - pdf, ps etc.

3. TikZ plots (vector graphics in LaTeX) - tikzDevice - NEW

4. Plotly - interactive web-based graphs via the open source JavaScript graphing library - NEWER


Devices

• A device defines where to plot, similar to paper, parchment etc.

• the same graph gets a different ‘look and feel’ on different devices

• (transparent) colours differ by device


R plotting - pdf-device

[Figure: notched boxplots "Regional spread of Greek GDP"; x-axis: Year (2000 to 2009), y-axis: k Euro per capita (10 to 30).]


Scientific plotting - tikzDevice

[Figure: "Regional spread of Greek GDP" rendered with tikzDevice; x-axis: Year (2000 to 2010), y-axis: k Euro per capita (10 to 30).]


R plotting

Typical process

Step 0: try out on the default device.

f <- paste(getwd(),"/GreekGDP.pdf",sep="") # define file

pdf(file=f, width=4, height=3) # open device

# plot into device

boxplot(gdp2[idx,"EUR_HAB"]/1000~gdp2[idx,"TIME"],notch=TRUE, xlab="Year", ylab="k Euro per capita",

main="Regional spread of Greek GDP")

dev.off() # close device
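The objects gdp2 and idx come from the course material and are not defined on these slides. The same open / plot / close sequence can be tried with built-in data; the file name and the grouping by waiting time below are assumptions made for illustration.

```r
# Built-in 'faithful' data instead of the course's gdp2 object
f <- file.path(tempdir(), "faithful_boxplot.pdf")   # define file

pdf(file = f, width = 4, height = 3)                # open device

# plot into device
boxplot(eruptions ~ (waiting >= 70), data = faithful,
        names = c("waiting < 70", "waiting >= 70"),
        xlab = "Regime", ylab = "eruptions",
        main = "Old Faithful eruptions")

dev.off()                                           # close device

file.exists(f)   # the file is complete only after dev.off()
```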


Plot functions

• plot() - generic function: what it draws depends on the class of the object to plot.

• points(), lines(), text() - add elements to an existing plot.

• barplot(), boxplot(), contour(), hist(), etc.

• plot functions from special packages


• options in the function, e.g. cex

• options through par()


Arguments frequently used in plot functions and par()

Argument      Description

axes          draw axes (or not); can be fine-tuned later
cex           size (multiplier) of points and text
lty, lwd      line type, line width
pch           point symbol
xlab, ylab    axis labels
xlim, ylim    axis limits
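A short sketch that uses each of these arguments once on the built-in faithful data. The plot goes to a temporary PDF file; the particular values are arbitrary choices.

```r
f <- tempfile(fileext = ".pdf")
pdf(f, width = 4, height = 3)

plot(faithful$waiting, faithful$eruptions,
     axes = FALSE,                     # suppress axes, drawn by hand below
     pch = 19, cex = 0.5,              # filled circles at half the default size
     xlim = c(40, 100), ylim = c(1, 6),
     xlab = "waiting", ylab = "eruptions")
axis(1, at = seq(40, 100, by = 20))    # fine-tuned x axis
axis(2, at = 1:6)                      # fine-tuned y axis
abline(v = 70, lty = 3, lwd = 2)       # dotted vertical line, double width

dev.off()
```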


R plotting - regions

par()-ameters oma/mar

Source: http://rgraphics.limnology.wisc.edu/rmargins_sf.php


par()-ameters mfcol/mfrow

Source: http://rgraphics.limnology.wisc.edu/rmargins_mfcol.php
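A minimal sketch of how mfrow, mar and oma work together: two histograms side by side with a common title in the outer margin. The margin sizes are arbitrary; the previous settings returned by par() are restored at the end.

```r
f <- tempfile(fileext = ".pdf")
pdf(f, width = 6, height = 3)

old <- par(mfrow = c(1, 2),        # 1 row, 2 columns, filled row by row
           mar = c(4, 4, 2, 1),    # margins per plot: bottom, left, top, right (in lines)
           oma = c(0, 0, 2, 0))    # outer margins around all plots

hist(faithful$waiting, main = "waiting", xlab = "minutes")
hist(faithful$eruptions, main = "eruptions", xlab = "minutes")
mtext("Old Faithful", outer = TRUE, cex = 1.2)   # title in the outer margin

par(old)                           # restore the previous settings
dev.off()
```

With mfcol instead of mfrow the panels are filled column by column.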


R plotting

General remarks

• Everything can be done,

• but it may cost time to figure out how, and to code it.


Descriptives

Check out summary(), table(), prop.table(), cor(), rcorr() (package Hmisc), etc.
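A quick tour of the base R functions named above, on the built-in faithful and mtcars data. rcorr() is left out because it needs the Hmisc package.

```r
summary(faithful$waiting)                    # min, quartiles, mean, max
cor(faithful$waiting, faithful$eruptions)    # Pearson correlation

long_wait <- faithful$waiting >= 70
tab <- table(long_wait)                      # absolute frequencies
prop.table(tab)                              # relative frequencies, sum to 1

# Two-way table: cylinders by transmission type
tab2 <- table(cyl = mtcars$cyl, am = mtcars$am)
prop.table(tab2, margin = 1)                 # row proportions
```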


Descriptives - free work

What’s the chance of a happy end? Use the ‘Titanic’ data in the package ‘datasets’.


Bring your project to the class room

YOUR PROJECT HERE.


Beyond this course


Inductive stats

Check out lm(), glm(), and the CRAN task views.
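As a starting point, a sketch of both functions on built-in data. The model choices are only illustrative, not a recommendation for these data.

```r
# Linear model: eruption duration explained by waiting time
fit_lm <- lm(eruptions ~ waiting, data = faithful)
coef(fit_lm)                 # intercept and slope
summary(fit_lm)$r.squared    # share of explained variance

# Generalized linear model: logistic regression of transmission type
# (am, 0/1) on car weight (wt) in the built-in mtcars data
fit_glm <- glm(am ~ wt, data = mtcars, family = binomial)
coef(fit_glm)
predict(fit_glm, newdata = data.frame(wt = 3), type = "response")
```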


Non-parametrics

Check out density() in base R, the np package, and the CRAN task views.
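A minimal kernel density estimate with density(), using its defaults on the built-in faithful data and plotting into a temporary PDF file:

```r
d <- density(faithful$waiting)   # Gaussian kernel, default bandwidth
d$bw                             # bandwidth chosen by the default rule
range(d$x)                       # grid on which the density is evaluated

f <- tempfile(fileext = ".pdf")
pdf(f, width = 4, height = 3)
plot(d, main = "Kernel density of waiting times")
rug(faithful$waiting)            # mark the observations on the axis
dev.off()
```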


Bayesian

Check out RStan (package rstan) and the CRAN task view on Bayesian inference.
