HPC in R

Taking R to the limit

Kutergin Alex

Perm State University, MiFIT

16 October 2012

Kutergin A. High performance computing with R

Outline

1 General words about R
2 Motivation and scope
3 The basic ways of speeding up R code
4 A special way of speeding up R code: package pnmath
5 The problem of data splitting: package iterators
6 Parallel computation with R: high-level parallelism (packages: parallel, snow and additional packages)
7 Parallel computation with R: low-level parallelism (package: Rmpi)
8 Parallel computation with R: parallel execution of for-loops (package: foreach)
9 Parallel computation with R: parallel computation on a graphics processing unit (package: gputools)
10 Working with very large datasets: packages filehash and bigmemory
11 Final words, some useful references and contacts


General words about R

R software
R is free, powerful software for data analysis and statistical computing. R is a console application with its own programming language, running in interpreter mode. The lack of a sophisticated GUI provides a number of advantages:

- there is no need to learn which algorithm is behind each button
- you can just learn the basic principles of R programming and effectively solve complex problems using the R programming language

Download R
R can be downloaded from the following link: http://cran.r-project.org/

Project page: www.r-project.org


General words about R: view of an R work session

[Screenshot of an R work session]

General words about R: packages and information sources

There are two sources of happiness for an R programmer:

- a source of information
- a source of packages

Motivation and scope

Motivation
- Computers are becoming more productive. Progress in computer hardware and software is amazing, and this computing power is now available even in a laptop
- The volume of data, and the complexity of the problems associated with processing it, keep growing
- The emergence of multi-core PCs and CUDA technology

Scope
- We are ordinary students, not people with powerful resources, so we don't have a supercomputer
- We have a Core i5, a Core i7 or another multi-core laptop or PC with support for CUDA technology
- We have some computational tasks and we want to solve them more efficiently

Page 39: HPC in R

Motivation and scope

MotivationComputers become more productive. Progress in computer’shardware and software is amazing. These computing power becameavailable even in a laptopConstantly increasing growth of data’s volume and the complexity ofproblems associated with data processingThe emergence of multi-core PCs and CUDA technology

Scope

We: simple students or not powerful guys. So we don’t havesupercomputerWe have Core i5 or Core i7 or another multi-core laptop or PC withsupport of CUDA technologyWe have some computational tasks and we want to solve them moreeffectively

Kutergin A. High performance computing with R

Page 40: HPC in R

Motivation and scope

MotivationComputers become more productive. Progress in computer’shardware and software is amazing. These computing power becameavailable even in a laptopConstantly increasing growth of data’s volume and the complexity ofproblems associated with data processingThe emergence of multi-core PCs and CUDA technology

Scope

We: simple students or not powerful guys. So we don’t havesupercomputerWe have Core i5 or Core i7 or another multi-core laptop or PC withsupport of CUDA technologyWe have some computational tasks and we want to solve them moreeffectively

Kutergin A. High performance computing with R

Page 41: HPC in R

Motivation and scope

MotivationComputers become more productive. Progress in computer’shardware and software is amazing. These computing power becameavailable even in a laptopConstantly increasing growth of data’s volume and the complexity ofproblems associated with data processingThe emergence of multi-core PCs and CUDA technology

Scope

We: simple students or not powerful guys. So we don’t havesupercomputerWe have Core i5 or Core i7 or another multi-core laptop or PC withsupport of CUDA technologyWe have some computational tasks and we want to solve them moreeffectively

Kutergin A. High performance computing with R

Page 42: HPC in R

Motivation and scope

MotivationComputers become more productive. Progress in computer’shardware and software is amazing. These computing power becameavailable even in a laptopConstantly increasing growth of data’s volume and the complexity ofproblems associated with data processingThe emergence of multi-core PCs and CUDA technology

Scope

We: simple students or not powerful guys. So we don’t havesupercomputerWe have Core i5 or Core i7 or another multi-core laptop or PC withsupport of CUDA technologyWe have some computational tasks and we want to solve them moreeffectively

Kutergin A. High performance computing with R

Page 43: HPC in R

Motivation and scope

MotivationComputers become more productive. Progress in computer’shardware and software is amazing. These computing power becameavailable even in a laptopConstantly increasing growth of data’s volume and the complexity ofproblems associated with data processingThe emergence of multi-core PCs and CUDA technology

Scope

We: simple students or not powerful guys. So we don’t havesupercomputerWe have Core i5 or Core i7 or another multi-core laptop or PC withsupport of CUDA technologyWe have some computational tasks and we want to solve them moreeffectively

Kutergin A. High performance computing with R

Page 44: HPC in R

Motivation and scope

MotivationComputers become more productive. Progress in computer’shardware and software is amazing. These computing power becameavailable even in a laptopConstantly increasing growth of data’s volume and the complexity ofproblems associated with data processingThe emergence of multi-core PCs and CUDA technology

Scope

We: simple students or not powerful guys. So we don’t havesupercomputerWe have Core i5 or Core i7 or another multi-core laptop or PC withsupport of CUDA technologyWe have some computational tasks and we want to solve them moreeffectively

Kutergin A. High performance computing with R

Page 45: HPC in R

Motivation and scope

MotivationComputers become more productive. Progress in computer’shardware and software is amazing. These computing power becameavailable even in a laptopConstantly increasing growth of data’s volume and the complexity ofproblems associated with data processingThe emergence of multi-core PCs and CUDA technology

Scope

We: simple students or not powerful guys. So we don’t havesupercomputerWe have Core i5 or Core i7 or another multi-core laptop or PC withsupport of CUDA technologyWe have some computational tasks and we want to solve them moreeffectively

Kutergin A. High performance computing with R

Page 46: HPC in R

Motivation and scope

MotivationComputers become more productive. Progress in computer’shardware and software is amazing. These computing power becameavailable even in a laptopConstantly increasing growth of data’s volume and the complexity ofproblems associated with data processingThe emergence of multi-core PCs and CUDA technology

Scope

We: simple students or not powerful guys. So we don’t havesupercomputerWe have Core i5 or Core i7 or another multi-core laptop or PC withsupport of CUDA technologyWe have some computational tasks and we want to solve them moreeffectively


The basic ways of speeding up the R-code
How to measure code execution time?

First way to measure execution time

#system.time() returns the CPU (and other) times that expr used
system.time(sum(runif(10000000)))

Second way to measure execution time

#proc.time() reports how much real and CPU time (in seconds) the currently running R process has already taken
start_time <- proc.time()
sum(runif(10000000))
#Despite its name, end_time holds the time elapsed since start_time
end_time <- proc.time() - start_time
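Single timings are noisy, so it can help to repeat a measurement. The following is a minimal sketch, assuming a hypothetical helper named `time_it` (it is not part of base R or of the slides) that runs a function several times with system.time() and collects the elapsed wall-clock times. The last lines repeat one measurement by hand with proc.time().

```r
# Hypothetical helper (not part of base R): time a function several times.
time_it <- function(expr_fun, times = 3) {
  # expr_fun is a zero-argument function wrapping the code to be timed
  vapply(seq_len(times), function(k) {
    system.time(expr_fun())[["elapsed"]]
  }, numeric(1))
}

elapsed <- time_it(function() sum(runif(1000000)), times = 3)
print(elapsed)          # three elapsed times in seconds
print(median(elapsed))  # a more stable summary than a single run

# The same kind of measurement done by hand with proc.time()
start_time <- proc.time()
total <- sum(runif(1000000))
manual_elapsed <- (proc.time() - start_time)[["elapsed"]]
print(manual_elapsed)
```

The "elapsed" entry is the wall-clock time; the "user.self" and "sys.self" entries report CPU time instead.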


The basic ways of speeding up the R-code
Analysis of the effectiveness of programs

Function's profile

Let us compare the general-purpose function lm() with the more specialized function lm.fit()

#Loading a dataset
data(longley)
#Recording the profile to file lm.out
Rprof("lm.out")
#Running lm() 1000 times
invisible(replicate(1000, lm(Employed ~ . - 1, data = longley)))
#Switching off profiling
Rprof(NULL)

#Preparing data for lm.fit()
longleydm <- data.matrix(data.frame(longley))
#Recording the profile to file lm.fit.out
Rprof("lm.fit.out")
#Running lm.fit() 1000 times
invisible(replicate(1000, lm.fit(longleydm[, -7], longleydm[, 7])))
#Switching off profiling
Rprof(NULL)

#Results of profiling
summaryRprof("lm.out")$sampling.time
[1] 3.12
summaryRprof("lm.fit.out")$sampling.time
[1] 0.18
#What a difference!
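Besides the total sampling time, summaryRprof() also reports where the time was spent. The following is a minimal sketch, assuming a temporary profile file and a workload that runs for about one second (both are choices made for this example, not taken from the slides). It prints the functions with the largest self time.

```r
# Profile a workload into a temporary file
data(longley)
longleydm <- data.matrix(longley)
prof_file <- tempfile(fileext = ".out")

Rprof(prof_file)
t0 <- proc.time()[["elapsed"]]
# Keep fitting for about one second so the sampler collects enough samples
while (proc.time()[["elapsed"]] - t0 < 1) {
  fit <- lm.fit(longleydm[, -7], longleydm[, 7])
}
Rprof(NULL)

prof <- summaryRprof(prof_file)
# by.self ranks functions by the time spent in their own code
top <- head(prof$by.self, 5)
print(top)
print(prof$sampling.time)
```

The by.total element of the same result ranks functions by the time spent in them including their callees.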

Package profr

This package allows you to visualize the results of profiling

library("profr")
plot(parse_rprof("lm.out"), main = "Profile of lm()")
plot(parse_rprof("lm.fit.out"), main = "Profile of lm.fit()")

Package proftools

This package allows you to visualize the call graph for a function

library("Rgraphviz"); library("proftools")
lmfitprod <- readProfileData("lm.fit.out")
plotProfileCallGraph(lmfitprod)

[Figure: Call graph]

Another example of profiling:

its = 2500; dim = 1750
X = matrix(rnorm(its * dim), its, dim)
my.cross.prod <- function(X) {
    C = matrix(0, ncol(X), ncol(X))
    for (i in 1:nrow(X)) {
        C = C + X[i,] %o% X[i,]
    }
    return(C)
}
library(proftools)
#Recording the profile to file matrix-mult.out
Rprof("matrix-mult.out")
C = my.cross.prod(X)
C1 = t(X) %*% X
C2 = crossprod(X)
Rprof(NULL)
#all.equal() compares two objects at a time
print(all.equal(C, C1))
print(all.equal(C, C2))

Result:

library(proftools)
profile.data <- readProfileData("matrix-mult.out")
flatProfile(profile.data)

              total.pct total.time self.pct self.time
my.cross.prod     87.31      88.36     0.04      0.04
+                 49.84      50.44    49.84     50.44
%o%               37.37      37.82     0.00      0.00
outer             37.37      37.82    37.27     37.72
%*%                7.75       7.84     7.75      7.84
crossprod          4.86       4.92     4.86      4.92
t                  0.16       0.16     0.06      0.06
t.default          0.10       0.10     0.10      0.10
matrix             0.06       0.06     0.06      0.06
as.vector          0.02       0.02     0.02      0.02

The basic ways of speeding up the R-code
Vectorization of code

Note!
Loops in R are slow! You can speed up your code by using operations on whole vectors and matrices. It is a different style of programming, but you have to use it!

#Simple example of vectorization:
#component-wise addition of two vectors
#Generating some random data
#First vector
a <- rnorm(n = 10000000)
#Second vector
b <- rnorm(n = 10000000)
#Vector for result
x <- rep(0, length(a))

So, what about the results?

#Slow way
time_1 <- system.time(
    for (i in 1:length(a)) {
        x[i] <- a[i] + b[i]
    }
); time_1[3]
36.97
#Fast way
time_2 <- system.time(x <- a + b); time_2[3]
0.04
Acceleration <- time_1[3] / time_2[3]
Acceleration
924.25
#That's hot!!!!
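Vectorization changes only the speed, not the result. The following is a minimal sketch on small vectors (the sizes, the seed and the clipping example are assumptions made for illustration) that checks a loop against its vectorized counterpart, first for addition and then for an element-wise condition.

```r
set.seed(1)
a <- rnorm(1000)
b <- rnorm(1000)

# Loop version: one element at a time
x_loop <- numeric(length(a))
for (i in seq_along(a)) {
  x_loop[i] <- a[i] + b[i]
}

# Vectorized version: one call on the whole vectors
x_vec <- a + b
print(identical(x_loop, x_vec))

# The same idea applies to conditions: clip negative sums to zero
clipped_loop <- numeric(length(x_vec))
for (i in seq_along(x_vec)) {
  clipped_loop[i] <- if (x_vec[i] < 0) 0 else x_vec[i]
}
clipped_vec <- pmax(x_vec, 0)
print(all.equal(clipped_loop, clipped_vec))
```

Here seq_along(a) is used instead of 1:length(a) because it also behaves correctly for a zero-length vector.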

The basic ways of speeding up the R-code
Using magic of linear algebra

Using linear algebra operations

#Scalar product
#Slow way
start <- proc.time()
res <- 0
for (i in 1:length(a)) {
    res <- res + a[i] * b[i]
}
end <- proc.time() - start; end[3]
16.71
#Fast
system.time(a %*% b)[3]
0.09
#Even faster...
system.time(sum(a * b))[3]
0.08
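The two fast variants differ in the shape of what they return. The following is a small sketch with made-up three-element vectors: `%*%` and crossprod() give a 1 x 1 matrix, while sum(a * b) gives a plain number.

```r
a <- c(1, 2, 3)
b <- c(4, 5, 6)

dot_matrix <- a %*% b                 # 1 x 1 matrix
dot_scalar <- sum(a * b)              # plain numeric scalar
dot_cross  <- drop(crossprod(a, b))   # crossprod() returns a 1 x 1 matrix; drop() strips the dimensions

print(dim(dot_matrix))
print(dot_scalar)
print(dot_cross)
```

drop() is a convenient way to turn the 1 x 1 result into a scalar when later code expects a plain number.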

#Matrix multiplication, slow version
its <- 2500; dim <- 1750
X <- matrix(rnorm(its * dim), its, dim)
X_transp <- t(X)
res <- array(NA, dim = c(1750, 1750))
start <- proc.time()
for (i in 1:nrow(X_transp)) {
    for (j in 1:ncol(X)) {
        res[i, j] <- sum(X_transp[i, ] * X[, j])
    }
}
end <- proc.time() - start; end[3]
221.67

BLAS library

BLAS stands for Basic Linear Algebra Subprograms. It is a library of optimized routines for linear algebra operations, and R relies on it for matrix computations. Multi-threaded BLAS implementations use all cores of a multi-core machine automatically.

#Matrix multiplication, fast version
#BLAS matrix mult
system.time(X_transp %*% X)[3]
7.77
#Even faster...
system.time(crossprod(X))[3]
4.98
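All three ways of computing the product should agree. The following is a minimal sketch on a small matrix (the 20 x 5 size and the seed are assumptions chosen so that it runs instantly) that compares the double loop, t(X) %*% X and crossprod(X) with all.equal().

```r
set.seed(42)
X <- matrix(rnorm(20 * 5), nrow = 20, ncol = 5)

# Explicit double loop, as in the slow version
res_loop <- matrix(NA_real_, ncol(X), ncol(X))
for (i in seq_len(ncol(X))) {
  for (j in seq_len(ncol(X))) {
    res_loop[i, j] <- sum(X[, i] * X[, j])
  }
}

res_mult  <- t(X) %*% X
res_cross <- crossprod(X)

# all.equal() compares two objects at a time, up to a small numerical tolerance
print(all.equal(res_loop, res_mult))
print(all.equal(res_mult, res_cross))
print(dim(res_cross))
```

crossprod(X) avoids forming t(X) explicitly, which is one plausible reason for the shorter time measured above.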

The basic ways of speeding up the R-code
Using built-in R functions

Package base

You can find the full list of built-in R functions in the documentation for this package

#Let us define a function
mySum <- function(N) {
    sumVal <- 0
    for (i in 1:N) { sumVal <- sumVal + i }
    return(sumVal)
}
system.time(mySum(1000000))[3]
0.62
system.time(sum(as.numeric(seq(1, 1000000))))[3]
0.05

Why are built-in R functions faster?

R code is interpreted, which is generally slower than running compiled code. When you call a built-in R function, you call optimized, compiled code. Many built-in functions are written in a lower-level programming language (such as C/C++ or Fortran), which gives more direct access to the capabilities of the hardware

Note!
You can select data from a vector, matrix, data.frame or array using a condition that applies to a row or column of the data object. It is fast and convenient

#Extracting only positive values from the first column of X
its <- 2500; dim <- 1750
X <- matrix(rnorm(its * dim), its, dim)
X[X[, 1] > 0, 1]
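The condition in the example above is a logical vector, and it can be stored and reused. The following is a minimal sketch on a small matrix and data frame (the names `keep`, `df` and the sizes are assumptions made for illustration) that selects values, whole rows, and data frame rows with the same mask.

```r
set.seed(7)
X <- matrix(rnorm(6 * 3), nrow = 6, ncol = 3)

# Logical mask: TRUE for rows whose first-column value is positive
keep <- X[, 1] > 0

positive_values <- X[keep, 1]               # only the first column
positive_rows   <- X[keep, , drop = FALSE]  # whole rows; drop = FALSE keeps a matrix even for one row

# The same pattern on a data frame
df <- data.frame(id = 1:6, value = X[, 1])
positive_df <- df[df$value > 0, ]

print(positive_values)
print(dim(positive_rows))
print(positive_df)
```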

The special way of speeding up the R-code

Package pnmath

Another easy way to get a speed-up is to use the pnmath package in R. This package takes many of the standard math functions in R and replaces them with multi-threaded versions, using OpenMP. Some functions get more of a speed-up than others with pnmath.

#Generating random data
v1 <- runif(1000)
v2 <- runif(100000000)
#Time of execution without pnmath
system.time(qtukey(v1, 2, 3))
system.time(exp(v2))
system.time(sqrt(v2))
#Time of execution with pnmath
library(pnmath)
system.time(qtukey(v1, 2, 3))
system.time(exp(v2))
system.time(sqrt(v2))

Problem of data splitting

Our problem:

Before you start the calculation you need to split your data set according to the number of threads. Another reason is more efficient data processing in loops

Package iterators

The iterators package provides tools for iterating over various R data structures. Iterators are available for vectors, lists, matrices, arrays, data frames and files. By following very simple conventions, new iterators can be written to support any type of data source, such as database queries or dynamically generated data

Download

You can download this useful package from CRAN (available for Windows!): http://cran.r-project.org/web/packages/iterators/index.html


Problem of data splitting: package iterators
Capabilities

icount(count)

This function returns an iterator that counts starting from one. count is the number of times that the iterator will fire. If not specified, it will count forever

nextElem()

This function returns the next value of a previously defined iterator. When the iterator has no more values, it calls stop with the message "StopIteration"

Example:

library(iterators)
#create an iterator that counts from 1 to 2
it <- icount(2)
nextElem(it)
[1] 1
nextElem(it)
[1] 2
try(nextElem(it)) # expect a StopIteration exception
Error : StopIteration
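The StopIteration convention is what lets a consumer loop over an iterator of unknown length. The following is a minimal sketch in base R only, assuming a hypothetical closure-based counter `make_counter` written for this illustration; it is not the implementation used by the iterators package, it only follows the same convention of calling stop("StopIteration") when the values run out.

```r
# Hypothetical closure-based counter, for illustration only
# (the iterators package provides icount() and nextElem() for real use)
make_counter <- function(count) {
  i <- 0
  list(nextElem = function() {
    if (i >= count) stop("StopIteration", call. = FALSE)
    i <<- i + 1
    i
  })
}

it <- make_counter(3)
values <- numeric(0)
repeat {
  # tryCatch turns the StopIteration error into a NULL sentinel;
  # any other error is re-signalled
  value <- tryCatch(it$nextElem(), error = function(e) {
    if (identical(conditionMessage(e), "StopIteration")) NULL else stop(e)
  })
  if (is.null(value)) break
  values <- c(values, value)
}
print(values)
```

The same repeat/tryCatch loop shape should carry over to nextElem(it) on an iterator created by the package.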

You can create an iterator over the rows of your data structure using the iter() function:

library(iterators)
#Creating iterator by rows of data set
irState <- iter(state.x77, by = "row")
nextElem(irState)
        Population Income Illiteracy  Life Murder   Area
Alabama       3615   3624        2.1 69.05   15.1  50708
nextElem(irState)
       Population Income Illiteracy  Life Murder   Area
Alaska        365   6315        1.5 69.31   11.3 566432
nextElem(irState)
        Population Income Illiteracy  Life Murder   Area
Arizona       2212   4530        1.8 70.55    7.8 113417

You can create an iterator over the columns of your data structure using the iter() function:

#Creating iterator by columns of data set
icState <- iter(state.x77, by = "col")
nextElem(icState)
        Population
Alabama       3615
Alaska         365
Arizona       2212
nextElem(icState)
        Income
Alabama   3624
Alaska    6315
Arizona   4530
nextElem(icState)
        Illiteracy
Alabama        2.1
Alaska         1.5
Arizona        1.8

You can also use the iter() function to create an iterator from a data object returned by some other function:

library(iterators)
#Define a function which generates random data
GetDataStructure <- function(meanVal1, meanVal2, sdVal1, sdVal2) {
    a <- rnorm(4, mean = meanVal1, sd = sdVal1)
    b <- rnorm(4, mean = meanVal2, sd = sdVal2)
    data <- a %o% b
    return(data)
}
ifun <- iter(GetDataStructure(25, 27, 2.5, 3.5), by = "row")
nextElem(ifun); nextElem(ifun)
         [,1]     [,2]     [,3]     [,4]
[1,] 701.7055 939.6574 764.7724 799.6965
         [,1]     [,2]     [,3]     [,4]
[1,] 647.6349 867.2512 705.8422 738.0752

idiv(n, chunks, chunkSize)

This is a more interesting iterator. It provides the ability to divide a numeric value into pieces

n - the numeric value to be divided into pieces
chunks - the number of pieces that n should be divided into. It is useful when you know the number of pieces that you want. If specified, chunkSize should not be
chunkSize - the maximum size of the pieces that n should be divided into. It is useful when you know the size of the pieces that you want. If specified, chunks should not be

Some thoughts...

However, the practical application of this iterator is unclear. Perhaps it can be used to index a vector or the rows/columns of an array


Example:

library(iterators)
# divide the value 10 into 3 pieces
it <- idiv(10, chunks = 3)
nextElem(it)
[1] 4
nextElem(it)
[1] 3
nextElem(it)
[1] 3
try(nextElem(it)) # expect a StopIteration exception
Error : StopIteration

Example:

library(iterators)
# divide the value 10 into pieces no larger than 3
it <- idiv(10, chunkSize = 3)
nextElem(it)
[1] 3
nextElem(it)
[1] 3
nextElem(it)
[1] 2
nextElem(it)
[1] 2
try(nextElem(it)) # expect a StopIteration exception
Error : StopIteration
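One possible use of such piece sizes is to index a vector chunk by chunk. The following is a minimal sketch in base R, assuming a hypothetical helper `chunk_sizes` that reproduces the sizes shown in the first example (4, 3, 3 for n = 10 and three chunks); it is an illustration of the arithmetic, not the code of idiv itself.

```r
# Hypothetical helper mimicking the sizes shown for idiv(10, chunks = 3);
# assumes n >= chunks so that no piece is empty
chunk_sizes <- function(n, chunks) {
  base_size <- n %/% chunks
  remainder <- n %% chunks
  # the first `remainder` pieces get one extra element
  rep(base_size, chunks) + (seq_len(chunks) <= remainder)
}

x <- seq(10, 100, by = 10)           # a vector of length 10
sizes <- chunk_sizes(length(x), 3)
ends <- cumsum(sizes)                # last index of each piece
starts <- ends - sizes + 1           # first index of each piece

# Process each piece independently, e.g. sum it
partial_sums <- vapply(seq_along(sizes), function(k) {
  sum(x[starts[k]:ends[k]])
}, numeric(1))

print(sizes)
print(partial_sums)
print(sum(partial_sums) == sum(x))
```

Each piece could be handed to a separate worker, and the partial results combined at the end.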

iread.table(file, ..., verbose = FALSE)

This is a very important iterator. It returns an iterator over the rows of the data frame stored in a file in table format

file - the name of the file to read data from
... - all additional arguments are passed on to the read.table function. See the documentation for read.table for more information
verbose - logical flag indicating whether or not to print the calls to read.table

Note!
In this version of iread.table, both the read.table arguments header and row.names must be specified. This is because the default values of these arguments depend on the contents of the beginning of the file. In order to make the subsequent calls to read.table work consistently, the user must specify those arguments explicitly
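The reason behind the note can be seen with read.table alone. The following is a minimal sketch in base R, assuming a small temporary file created just for this example; it reads the file one row at a time from an open connection and is not the implementation of iread.table. Only the first call sees the header line, so later calls need header = FALSE and explicit column names to return rows of the same shape.

```r
# Write a small table to a temporary file
path <- tempfile(fileext = ".txt")
write.table(data.frame(x = 1:3, y = c(10, 20, 30)), path,
            row.names = FALSE, quote = FALSE)

con <- file(path, open = "r")
# First call: the header line is present, so header = TRUE
first_row <- read.table(con, header = TRUE, nrows = 1, row.names = NULL)
# Later calls start in the middle of the file: there is no header line left,
# so header = FALSE and the column names are passed explicitly
second_row <- read.table(con, header = FALSE, nrows = 1, row.names = NULL,
                         col.names = names(first_row))
third_row <- read.table(con, header = FALSE, nrows = 1, row.names = NULL,
                        col.names = names(first_row))
close(con)

print(first_row)
print(second_row)
print(third_row)
```

If header were left to its default, read.table would have to guess it from whatever lines happen to come next, which is the inconsistency the note warns about.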

Kutergin A. High performance computing with R

Page 96: HPC in R

Problem of data splitting: package iteratorsCapabilities

iread.table(file,...., verbose = FALSE)

This is very important iterator. It returns an iterator object over the rowsof the data frame stored in a file in table format

file - the name of the file to read data from... - all additional arguments are passed on to the read.tablefunction. See the documentation for read.table for more informationverbose - logical flag indicating whether or not to print the calls toread.table

Note!In this version of iread.table, both the read.table arguments header androw.names must be specified. This is because the default values of thisarguments depend on the contents of the beginning of the file. In orderto make the subsequent calls to read.table work consistently, the usermust specified those arguments explicitly

Kutergin A. High performance computing with R

Page 97: HPC in R

Problem of data splitting: package iteratorsCapabilities

iread.table(file,...., verbose = FALSE)

This is very important iterator. It returns an iterator object over the rowsof the data frame stored in a file in table format

file - the name of the file to read data from... - all additional arguments are passed on to the read.tablefunction. See the documentation for read.table for more informationverbose - logical flag indicating whether or not to print the calls toread.table

Note!In this version of iread.table, both the read.table arguments header androw.names must be specified. This is because the default values of thisarguments depend on the contents of the beginning of the file. In orderto make the subsequent calls to read.table work consistently, the usermust specified those arguments explicitly

Kutergin A. High performance computing with R

Page 98: HPC in R

Problem of data splitting: package iteratorsCapabilities

iread.table(file,...., verbose = FALSE)

This is very important iterator. It returns an iterator object over the rowsof the data frame stored in a file in table format

file - the name of the file to read data from... - all additional arguments are passed on to the read.tablefunction. See the documentation for read.table for more informationverbose - logical flag indicating whether or not to print the calls toread.table

Note!In this version of iread.table, both the read.table arguments header androw.names must be specified. This is because the default values of thisarguments depend on the contents of the beginning of the file. In orderto make the subsequent calls to read.table work consistently, the usermust specified those arguments explicitly


Example:

library(iterators)

# Generating random data
its <- 2000000; dim <- 3
data <- matrix(rnorm(its * dim), its, dim)

# Writing it to the hard disk
DATA_PATH <- "E:/R_works/data.txt"
# Size of this file: 123 MB
write.table(data, file = DATA_PATH,
            append = FALSE, sep = "\t", dec = ".")


# Creating an iterator from this file
ifile <- iread.table(DATA_PATH, header = TRUE,
                     row.names = NULL, verbose = FALSE)
> nextElem(ifile)
  row.names        V1        V2       V3
1         1 -1.042623 -1.386382 0.399798
> nextElem(ifile)
  row.names        V1        V2        V3
1         2 0.8841238 -1.296501 0.1580505
> nextElem(ifile)
  row.names         V1         V2        V3
1         3 -0.3195784 -0.6830442 0.3647958

# It works very fast!
# Remove the file
file.remove(DATA_PATH)


isplit(x, f, drop = FALSE)

Another important type of iterator. It returns an iterator that divides the data in the vector x into the groups defined by f.

x - vector or data frame of values to be split into groups
f - a factor or list of factors used to categorize x
drop - logical indicating if levels that do not occur should be dropped

More detailed information can be found in the documentation.

Note! This is very useful. For example, suppose you have a data vector and a vector containing the values of a factor corresponding to these data, and the factor has pre-defined levels. You can then extract the data for each level of the factor in a loop without additional operations. You can also define conditions for each level of the factor in the loop body and use them as conditions for if() control structures.


x <- rnorm(200)
f <- factor(sample(1:10, length(x), replace = TRUE))
it <- isplit(x, f)

nextElem(it)
$value
 [1]  0.14087878 -0.94439161  0.13593045
 [4] -0.25732860  0.09422130 -0.55166303
 [7] -0.18325419 -0.00871019  0.38344388
[10] -1.05761926  1.16126462 -0.02280205
[13] -0.67338941  1.68724264  0.92112983
[16]  1.39782337 -0.51060989

$key
$key[[1]]
[1] "1"
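The note on isplit mentions extracting the data for each factor level in a loop and applying a per-level condition inside if(). A minimal sketch of that idea follows. The seed, the threshold of 5 observations and the name group_means are assumptions made for illustration only; they do not come from the slides.

```r
library(iterators)

set.seed(42)  # assumption: a fixed seed, only to make the run repeatable
x <- rnorm(200)
f <- factor(sample(1:10, length(x), replace = TRUE))

it <- isplit(x, f)
group_means <- numeric(0)  # hypothetical result container

repeat {
  # nextElem() raises a "StopIteration" error once every level has been visited
  elem <- tryCatch(nextElem(it), error = function(e) {
    if (identical(conditionMessage(e), "StopIteration")) NULL else stop(e)
  })
  if (is.null(elem)) break

  level <- as.character(elem$key[[1]])
  # a per-level condition inside if(): skip levels with fewer than 5 observations
  if (length(elem$value) >= 5) {
    group_means[level] <- mean(elem$value)
  }
}

print(group_means)
```

Each element returned by the iterator carries the data for one level in value and the level itself in key, so no manual subsetting of x is needed inside the loop.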


Special types of iterators

There are also special types of iterators, such as irnorm(..., count) or irunif(..., count). These functions return iterators that generate random numbers from various distributions. Each one is a wrapper around a standard R function.

count - number of times that the iterator will fire. If not specified, it will fire values forever
... - arguments to pass to the underlying rnorm function

Example:

# create an iterator that returns two random numbers
it <- irnorm(1, count = 2)
nextElem(it); nextElem(it)
[1] 0.1592311
[1] -1.387449
try(nextElem(it)) # expect a StopIteration exception
Error : StopIteration


Parallel computation with R: high-level parallelism
packages: parallel, snow

Scope

High-level parallelism means that you do not need to define the scheme of communication between processes, that is, which process is the master and which processes are the slaves. You only initialize the parallel environment and work inside it. All the details are handled by the package's functions.

Package: snow

The package contains the basic functions that allow you to create different types of clusters on a multicore machine.

Package: parallel

This package builds on the packages multicore and snow and provides drop-in replacements for most of the functionality of those packages.


Parallel computation with R: high-level parallelism
package: parallel

Description

The landscape of parallel computing has changed with the advent of shared-memory computers with multiple (and often many) CPU cores. Until the late 2000’s parallel computing was mainly done on clusters of large numbers of single- or dual-CPU computers: nowadays even laptops have two or four cores, and servers with 8 or more cores are commonplace. It is such hardware that package parallel is designed to exploit. It can also be used with several computers running the same version of R connected by (reasonable-speed) Ethernet: the computers need not be running the same OS.

Scope

Parallelism can be done in computation at many different levels: this package is principally concerned with "coarse-grained parallelization".


Computational model

This package handles running much larger chunks of computations in parallel. The crucial point is that these chunks of computation are unrelated and do not need to communicate in any way. It is often the case that the chunks take approximately the same length of time. The basic computational model is:

(a) Start up M "worker" processes, and do any initialization needed on the workers
(b) Send any data required for each task to the workers
(c) Split the task into M roughly equally-sized chunks, and send the chunks (including the R code needed) to the workers
(d) Wait for all the workers to complete their tasks, and ask them for their results
(e) Repeat steps (b - d) for any further tasks
(f) Shut down the worker processes
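Steps (a) to (f) above can be sketched with the parallel package roughly as follows. This is only an illustration under stated assumptions: two workers, a toy task (summing shifted numbers), and the names offset, tasks, chunks and total are all hypothetical.

```r
library(parallel)

M <- 2  # assumption: two worker processes

# (a) start M worker processes and do any initialization they need
cl <- makeCluster(M)
clusterEvalQ(cl, library(stats))

# (b) send the data required for the task to the workers
offset <- 10
clusterExport(cl, "offset", envir = environment())

# (c) split the task into M roughly equal chunks; the chunks and the
#     function to run are sent to the workers by clusterApply()
tasks <- 1:100
chunks <- splitIndices(length(tasks), M)

# (d) wait for all workers to finish and collect their results
partial <- clusterApply(cl, chunks,
                        function(idx, tasks) sum(tasks[idx] + offset),
                        tasks = tasks)
total <- Reduce(`+`, partial)

# (e) steps (b) to (d) could be repeated here for further tasks

# (f) shut down the worker processes
stopCluster(cl)

print(total)
```

The chunks never communicate with each other; only the master combines the partial results, which matches the coarse-grained model described above.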


Parallel computation with R: high-level parallelism
package: snow

Description

The package contains the basic functions that allow you to create different types of clusters on a multicore machine, such as:

makeSOCKcluster(names, ..., options = defaultClusterOptions)
makeMPIcluster(count, ..., options = defaultClusterOptions)

It also contains specific functions for computing on snow clusters, such as:

clusterCall(cl, fun, ...) calls a function fun with identical arguments ... on each node in the cluster cl and returns a list of the results
clusterEvalQ(cl, expr) evaluates a literal expression on each cluster node. It is a cluster version of evalq
clusterApply(cl, x, fun, ...) calls fun on the first cluster node with arguments x[[1]] and ..., on the second node with arguments x[[2]] and ..., and so on

Further syntax is not covered here; all details can be found in the documentation.
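A short sketch of the three cluster functions listed above. As an assumption, it uses the parallel package rather than snow: parallel ships with R and provides functions with the same names, and its makePSOCKcluster stands in for makeSOCKcluster here. The variable names (scale_factor, res_call, res_apply) are hypothetical.

```r
library(parallel)

# assumption: a local socket cluster with two nodes
cl <- makePSOCKcluster(2)

# clusterCall: the same function with identical arguments on every node
res_call <- clusterCall(cl, function(a, b) a + b, 1, 2)

# clusterEvalQ: evaluate a literal expression on every node,
# here defining a variable in each worker's global environment
clusterEvalQ(cl, scale_factor <- 3)

# clusterApply: node 1 receives x[[1]], node 2 receives x[[2]]
res_apply <- clusterApply(cl, list(10, 20), function(v) v * scale_factor)

stopCluster(cl)

print(res_call)
print(res_apply)
```

Each call returns a list with one element per node (clusterCall, clusterEvalQ) or per element of x (clusterApply).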

Kutergin A. High performance computing with R

Page 128: HPC in R

Parallel computation with R: high-level parallelismpackage: Snow

Description

Package contains the basic function allow you to create different type ofclusters on a multicore machine. Like

makeSOCKcluster(names, ..., options = defaultClusterOptions)makeMPIcluster(count, ..., options = defaultClusterOptions)

Also it contains specific functions for computing on SNOW clusters. Like:clusterCall(cl, fun, ...) calls a function fun with identical arguments... on each node in the cluster cl and returns a list of the resultsclusterEvalQ(cl, expr) evaluates a literal expression on each clusternode. It is a cluster version of evalqclusterApply(cl, x, fun, ...) calls fun on the first cluster node witharguments seq[[1]] and ..., on the second node with argumentsseq[[2]] and ..., and so on.

It makes no sense to go into further syntax. All details you can find indocumentation

Kutergin A. High performance computing with R

Page 129: HPC in R

Parallel computation with R: high-level parallelismpackage: Snow

Description

Package contains the basic function allow you to create different type ofclusters on a multicore machine. Like

makeSOCKcluster(names, ..., options = defaultClusterOptions)makeMPIcluster(count, ..., options = defaultClusterOptions)

Also it contains specific functions for computing on SNOW clusters. Like:clusterCall(cl, fun, ...) calls a function fun with identical arguments... on each node in the cluster cl and returns a list of the resultsclusterEvalQ(cl, expr) evaluates a literal expression on each clusternode. It is a cluster version of evalqclusterApply(cl, x, fun, ...) calls fun on the first cluster node witharguments seq[[1]] and ..., on the second node with argumentsseq[[2]] and ..., and so on.

It makes no sense to go into further syntax. All details you can find indocumentation

Kutergin A. High performance computing with R

Page 130: HPC in R

Parallel computation with R: high-level parallelismpackage: Snow

Description

Package contains the basic function allow you to create different type ofclusters on a multicore machine. Like

makeSOCKcluster(names, ..., options = defaultClusterOptions)makeMPIcluster(count, ..., options = defaultClusterOptions)

Also it contains specific functions for computing on SNOW clusters. Like:clusterCall(cl, fun, ...) calls a function fun with identical arguments... on each node in the cluster cl and returns a list of the resultsclusterEvalQ(cl, expr) evaluates a literal expression on each clusternode. It is a cluster version of evalqclusterApply(cl, x, fun, ...) calls fun on the first cluster node witharguments seq[[1]] and ..., on the second node with argumentsseq[[2]] and ..., and so on.

It makes no sense to go into further syntax. All details you can find indocumentation

Kutergin A. High performance computing with R

Page 131: HPC in R

Parallel computation with R: high-level parallelismpackage: Snow

Description

Package contains the basic function allow you to create different type ofclusters on a multicore machine. Like

makeSOCKcluster(names, ..., options = defaultClusterOptions)makeMPIcluster(count, ..., options = defaultClusterOptions)

Also it contains specific functions for computing on SNOW clusters. Like:clusterCall(cl, fun, ...) calls a function fun with identical arguments... on each node in the cluster cl and returns a list of the resultsclusterEvalQ(cl, expr) evaluates a literal expression on each clusternode. It is a cluster version of evalqclusterApply(cl, x, fun, ...) calls fun on the first cluster node witharguments seq[[1]] and ..., on the second node with argumentsseq[[2]] and ..., and so on.

It makes no sense to go into further syntax. All details you can find indocumentation

Kutergin A. High performance computing with R

Page 132: HPC in R

Parallel computation with R: high-level parallelismpackage: Snow

Description

Package contains the basic function allow you to create different type ofclusters on a multicore machine. Like

makeSOCKcluster(names, ..., options = defaultClusterOptions)makeMPIcluster(count, ..., options = defaultClusterOptions)

Also it contains specific functions for computing on SNOW clusters. Like:clusterCall(cl, fun, ...) calls a function fun with identical arguments... on each node in the cluster cl and returns a list of the resultsclusterEvalQ(cl, expr) evaluates a literal expression on each clusternode. It is a cluster version of evalqclusterApply(cl, x, fun, ...) calls fun on the first cluster node witharguments seq[[1]] and ..., on the second node with argumentsseq[[2]] and ..., and so on.

It makes no sense to go into further syntax. All details you can find indocumentation

Kutergin A. High performance computing with R

Page 133: HPC in R

Parallel computation with R: high-level parallelismpackage: Snow

Description

Package contains the basic function allow you to create different type ofclusters on a multicore machine. Like

makeSOCKcluster(names, ..., options = defaultClusterOptions)makeMPIcluster(count, ..., options = defaultClusterOptions)

Also it contains specific functions for computing on SNOW clusters. Like:clusterCall(cl, fun, ...) calls a function fun with identical arguments... on each node in the cluster cl and returns a list of the resultsclusterEvalQ(cl, expr) evaluates a literal expression on each clusternode. It is a cluster version of evalqclusterApply(cl, x, fun, ...) calls fun on the first cluster node witharguments seq[[1]] and ..., on the second node with argumentsseq[[2]] and ..., and so on.

It makes no sense to go into further syntax. All details you can find indocumentation

Kutergin A. High performance computing with R

Page 134: HPC in R

Parallel computation with R: high-level parallelismpackage: Snow

Description

Package contains the basic function allow you to create different type ofclusters on a multicore machine. Like

makeSOCKcluster(names, ..., options = defaultClusterOptions)makeMPIcluster(count, ..., options = defaultClusterOptions)

Also it contains specific functions for computing on SNOW clusters. Like:clusterCall(cl, fun, ...) calls a function fun with identical arguments... on each node in the cluster cl and returns a list of the resultsclusterEvalQ(cl, expr) evaluates a literal expression on each clusternode. It is a cluster version of evalqclusterApply(cl, x, fun, ...) calls fun on the first cluster node witharguments seq[[1]] and ..., on the second node with argumentsseq[[2]] and ..., and so on.

It makes no sense to go into further syntax. All details you can find indocumentation

Kutergin A. High performance computing with R

Page 135: HPC in R

Parallel computation with R: high-level parallelismpackage: Snow

Description

Package contains the basic function allow you to create different type ofclusters on a multicore machine. Like

makeSOCKcluster(names, ..., options = defaultClusterOptions)makeMPIcluster(count, ..., options = defaultClusterOptions)

Also it contains specific functions for computing on SNOW clusters. Like:clusterCall(cl, fun, ...) calls a function fun with identical arguments... on each node in the cluster cl and returns a list of the resultsclusterEvalQ(cl, expr) evaluates a literal expression on each clusternode. It is a cluster version of evalqclusterApply(cl, x, fun, ...) calls fun on the first cluster node witharguments seq[[1]] and ..., on the second node with argumentsseq[[2]] and ..., and so on.

It makes no sense to go into further syntax. All details you can find indocumentation


Parallel computation with R: high-level parallelism
Packages: doParallel, doSNOW

Package: doSNOW

The registerDoSNOW(cl) function registers the snow parallel backend with the foreach package, where cl is the cluster object to use for parallel execution

Package: doParallel

The registerDoParallel(cl, cores=NULL, ...) function registers a parallel backend for foreach that is built on the parallel package. Its arguments are:

cl - a cluster object returned by makeCluster, or the number of nodes to be created in the cluster. If not specified, on Windows a three-worker cluster is created and used
cores - the number of cores to use for parallel execution
... - package options
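After registering a backend it can be useful to check what foreach actually sees. The sketch below assumes the foreach and doParallel packages are installed and uses a hypothetical two-worker cluster; getDoParName and getDoParWorkers are query functions from the foreach package.

```r
library(parallel)
library(foreach)
library(doParallel)

# Assumption: a small cluster of two local workers
cl <- makeCluster(2)
registerDoParallel(cl)

# Ask foreach which backend is registered and how many workers it has
backend_name <- getDoParName()
worker_count <- getDoParWorkers()

# A loop that now runs on the registered workers
squares <- foreach(i = 1:4, .combine = c) %dopar% { i^2 }

# Stop the workers and fall back to the sequential backend
stopCluster(cl)
registerDoSEQ()
```

With doSNOW the pattern would be the same, except that the cluster comes from snow and is registered with registerDoSNOW(cl).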


Parallel computation with R: high-level parallelism
Example of a cluster based on the parallel package

library(parallel)
library(doParallel)

# Detect how many cores we have
CoresCount <- detectCores()
CoresCount
# [1] 4

# Initialize the cluster
cl <- makeCluster(CoresCount)
cl
# socket cluster with 4 nodes on host ‘localhost’

# How many cores of our cluster we are going to use
CoresCountForUse <- CoresCount
CoresCountForUse
# [1] 4

# Register the parallel backend
registerDoParallel(cl, cores = CoresCountForUse)

# Some expressions

# Stop our cluster
stopCluster(cl)


Parallel computation with R: high-level parallelism
Example of a cluster based on the snow package

library(snow)
library(doSNOW)

# Make a socket cluster with four worker processes
clSnow <- makeCluster(c("localhost", "localhost", "localhost", "localhost"), type = "SOCK")
clSnow
# socket cluster with 4 nodes on host ‘localhost’

# Register the parallel backend
registerDoSNOW(clSnow)

# Some expressions

# Stop our cluster
stopCluster(clSnow)


Parallel computation with R: low-level parallelism
Package: Rmpi

Description

Rmpi is the MPI interface for R. This package allows you to create R programs that run cooperatively in parallel across multiple machines, or multiple CPUs on one machine, to accomplish a goal more quickly than a single program running on one machine

So...
I have not worked with this package yet, so I can’t say much about it. This work is in progress


Parallel computation with R: parallel execution of for-loops
Package: foreach

Motivation
In many practical cases it is impossible to avoid using loops. Loops are slow, and it would be great to speed up their execution

Description

The foreach package provides a new looping construct for executing R code repeatedly. The main reason for using the foreach package is that it supports parallel execution. The foreach package can be used with a variety of different parallel computing systems, including NetWorkSpaces and snow. In addition, foreach can be used with iterators, which allow the data to be specified in a very flexible way

Note!
foreach constructs run in parallel only inside an initialized parallel environment! That is, they run in parallel only after a parallel backend, such as a parallel or snow cluster, has been registered
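To make the description more concrete, here is a minimal sketch comparing an ordinary for loop with the foreach construct, and using an iterator as the data source. It assumes the foreach and iterators packages are installed; the variable names and the toy data are invented for the illustration, and the loops run sequentially with %do%, so no cluster is needed.

```r
library(foreach)
library(iterators)

# Ordinary for loop: the results have to be stored by hand
squares_for <- numeric(5)
for (i in 1:5) squares_for[i] <- i^2

# foreach returns the results itself, as a list by default
squares_foreach <- foreach(i = 1:5) %do% { i^2 }

# An iterator feeds the loop one matrix row at a time
m <- matrix(1:6, nrow = 2)
row_sums <- foreach(r = iter(m, by = "row"), .combine = c) %do% { sum(r) }
```

Replacing %do% with %dopar% after a backend has been registered is the step that makes such a loop run in parallel.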


Parallel computation with R: parallel execution of for-loops
Operators used with a foreach object

Operator %do%

It is a binary operator that operates on a foreach object and an R expression. The expression is evaluated multiple times in an environment that is created by the foreach object, and that environment is modified for each evaluation as specified by the foreach object. %do% evaluates the expression sequentially. The results of evaluating the expression are returned as a list by default

Operator %dopar%

%dopar% is the parallel version of the %do% operator. It evaluates the expression in parallel

Operator %:%

The operator %:% is called the nesting operator. It is a binary operator used to merge two foreach objects into a single structure
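The three operators can be seen side by side in a sketch like this one. It assumes the foreach and doParallel packages are installed and uses a hypothetical two-worker cluster; the loop bodies are toy arithmetic chosen only to keep the results easy to check.

```r
library(parallel)
library(foreach)
library(doParallel)

# %do%: sequential evaluation, result is a list by default
seq_res <- foreach(i = 1:3) %do% { i * 10 }

# %dopar%: the same loop on a registered backend (assumption: two workers)
cl <- makeCluster(2)
registerDoParallel(cl)
par_res <- foreach(i = 1:3) %dopar% { i * 10 }

# %:% merges two foreach objects into one nested loop;
# the inner results are concatenated, the outer ones stacked as rows
nested <- foreach(a = 1:2, .combine = rbind) %:%
  foreach(b = 1:3, .combine = c) %dopar% { a * b }

stopCluster(cl)
registerDoSEQ()
```

Here nested is a 2 x 3 matrix whose element in row a and column b is a * b.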


Parallel computation with R: parallel execution of for-loops
Main arguments of the foreach function

Note! This is important

.combine - a function that is used to process the task results as they are generated. It can also be specified as a non-empty character string naming the function. Specifying "c" is useful for concatenating the results into a vector. The values "rbind" and "cbind" can combine vectors into a matrix. The values "+" and "*" can be used to process numeric data
.inorder - a logical flag indicating whether the .combine function requires the task results to be combined in the same order in which they were submitted. If the order is not important, then setting .inorder to FALSE can give improved performance
.multicombine - a logical flag indicating whether the .combine function can accept more than two arguments. If it can take more than two arguments, then setting .multicombine to TRUE could improve the performance
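A few small loops show what these arguments do in practice. This is only a sketch: it assumes the foreach package is installed, runs sequentially with %do% so that no cluster is required, and uses invented variable names and toy data.

```r
library(foreach)

# "c" concatenates the task results into a vector
vec <- foreach(i = 1:4, .combine = "c") %do% { i^2 }

# "rbind" stacks vector results as the rows of a matrix
mat <- foreach(i = 1:3, .combine = "rbind") %do% { c(i, i^2) }

# "+" reduces numeric results to their sum; addition does not depend on
# the order of the results, so .inorder = FALSE is safe here
total <- foreach(i = 1:4, .combine = "+", .inorder = FALSE) %do% { i }

# cbind accepts many arguments at once, so .multicombine = TRUE lets
# foreach pass several results to it in a single call
cols <- foreach(i = 1:3, .combine = "cbind", .multicombine = TRUE) %do% { c(i, i^2) }
```

In this sketch vec is 1 4 9 16 and total is 10; mat has three rows and cols has three columns, one per task.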


Parallel computation with R: parallel execution of for-loops
Main arguments of the foreach function

Note! This is important

.errorhandling - specifies how a task evaluation error should be handled. If the value is "stop", then execution will be stopped if an error occurs. If the value is "remove", the result for that task will not be returned or passed to the .combine function. If it is "pass", then the error object generated by task evaluation will be included with the rest of the results. It is assumed that the combine function will be able to deal with the error object
.packages - a character vector of packages that the tasks depend on
.verbose - a logical flag enabling verbose messages. This can be very useful for troubleshooting
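The three .errorhandling modes can be compared with a deliberately failing task. The sketch assumes the foreach package is installed; risky is a hypothetical helper written for this example, and the loops run sequentially with %do%. The .packages and .verbose arguments are not exercised here.

```r
library(foreach)

# Hypothetical task that fails for one particular input
risky <- function(i) {
  if (i == 2) stop("task 2 failed")
  i * 10
}

# "remove": the failed task is silently dropped from the results
removed <- foreach(i = 1:3, .combine = c, .errorhandling = "remove") %do% risky(i)

# "pass": the error object takes the place of the failed result
passed <- foreach(i = 1:3, .errorhandling = "pass") %do% risky(i)

# "stop": the whole loop fails, so the error is caught here by hand
stopped <- tryCatch(
  foreach(i = 1:3, .errorhandling = "stop") %do% risky(i),
  error = function(e) conditionMessage(e)
)
```

After this runs, removed holds only the two successful results, passed[[2]] is an error object that can be inspected with conditionMessage, and stopped holds the error message of the aborted loop.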

Further immersion
As always, you can find all the details in the documentation for this useful package

Page 185: HPC in R

Parallel computation with R: parallel execution of for-loops
Examples of foreach usage

# Sequentially
time_seq <- system.time(foreach(i = 1:100) %do% {
  sum(runif(10000000))
})
time_seq[3]
# 31.06

# In parallel
time_par <- system.time(foreach(i = 1:100) %dopar% {
  sum(runif(10000000))
})
time_par[3]
# 15.25

# Acceleration
acceleration <- time_seq[3] / time_par[3]
acceleration
#  elapsed
# 2.036721

Page 186: HPC in R

Parallel computation with R: parallel execution of for-loops
Examples of foreach usage

# Sequentially
time_seq <- system.time(foreach(i = 1:100) %do% {
  sum(sin(runif(10000000)))
})
time_seq[3]
# 87.46

# In parallel
time_par <- system.time(foreach(i = 1:100) %dopar% {
  sum(sin(runif(10000000)))
})
time_par[3]
# 33.82

# Acceleration
acceleration <- time_seq[3] / time_par[3]
acceleration
#  elapsed
# 2.586044

Page 187: HPC in R

Parallel computation with R: parallel execution of for-loops
Examples of foreach usage

# Without .combine the results are returned as a list
foreachResult <- foreach(i = 1:100) %dopar% {
  sum(runif(10000000))
}
class(foreachResult)
# [1] "list"
nrow(foreachResult)
# NULL
ncol(foreachResult)
# NULL
length(foreachResult)
# [1] 100

Page 188: HPC in R

Parallel computation with R: parallel execution of for-loops
Examples of foreach usage

# Combine results as a matrix by columns
foreachResult2 <- foreach(i = 1:100, .combine = "cbind") %dopar% {
  sum(runif(10000000))
}
class(foreachResult2)
# [1] "matrix"
nrow(foreachResult2)
# [1] 1
ncol(foreachResult2)
# [1] 100

Page 189: HPC in R

Parallel computation with R: parallel execution of for-loops
Examples of foreach usage

# Combine results as a matrix by rows
foreachResult3 <- foreach(i = 1:100, .combine = "rbind") %dopar% {
  sum(runif(10000000))
}
class(foreachResult3)
# [1] "matrix"
nrow(foreachResult3)
# [1] 100
ncol(foreachResult3)
# [1] 1

Page 190: HPC in R

Parallel computation with R: parallel execution of for-loops
Examples of foreach usage

# Parallel, .multicombine = FALSE and .inorder = TRUE
time1 <- system.time(foreach(i = 1:100, .combine = "rbind",
                             .multicombine = FALSE,
                             .inorder = TRUE) %dopar% {
  sum(runif(10000000))
})
time1[3]
# elapsed
#   15.13

# Parallel, .multicombine = TRUE and .inorder = FALSE
time2 <- system.time(foreach(i = 1:100, .combine = "rbind",
                             .multicombine = TRUE,
                             .inorder = FALSE) %dopar% {
  sum(runif(10000000))
})
time2[3]
# elapsed
#   15.02
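What .multicombine changes is how often the combine function is called, not what it returns. A minimal sketch of this, assuming the foreach package is installed: counting_combine is a hypothetical wrapper around c() that counts its own calls, and %do% is used so that no backend is needed.

```r
library(foreach)

# Hypothetical combine function that counts how often it is called
n_calls <- 0
counting_combine <- function(...) {
  n_calls <<- n_calls + 1
  c(...)
}

# Pairwise combining: the function receives two arguments at a time
pairwise <- foreach(i = 1:10, .combine = counting_combine,
                    .multicombine = FALSE) %do% { i }
calls_pairwise <- n_calls

# Multi-argument combining: many results are passed in a single call
n_calls <- 0
batched <- foreach(i = 1:10, .combine = counting_combine,
                   .multicombine = TRUE) %do% { i }
calls_batched <- n_calls
```

Both loops produce the same vector, but the batched version needs fewer combine calls. In the timings above the two settings differ by only about 0.1 s, which is consistent with the combine step being a very small part of the total work in that example.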

Page 191: HPC in R

Parallel computation with R: parallel execution of for-loops
Examples of foreach usage

# Parallel, list as a result
time_list <- system.time(foreach(i = 1:100) %dopar% {
  sum(runif(10000000))
})
time_list[3]
# elapsed
#   15.24

acceleration <- time1[3] / time2[3]
acceleration
#  elapsed
# 1.007324
accelerationList1 <- time_list[3] / time1[3]
accelerationList1
# elapsed
# 1.00727
accelerationList2 <- time_list[3] / time2[3]
accelerationList2
#  elapsed
# 1.014647

Page 192: HPC in R

Parallel computation with R: parallel execution of for-loops
Examples of foreach usage

# Nested loops, both executed sequentially
start <- proc.time()
SomeResult <- foreach(i = 1:4, .combine = sum) %do% {
  foreach(k = 1:1000, .combine = "c",
          .multicombine = TRUE, .inorder = FALSE) %do% {
    sin(i) * cos(k)
  }
}
end <- proc.time() - start
end
# 1.76
SomeResult
# [1] 0.6106603

Page 193: HPC in R

Parallel computation with R: parallel execution of for-loops
Examples of foreach usage

# Nested loops: outer loop sequential, inner loop in parallel
start <- proc.time()
SomeResult <- foreach(i = 1:4, .combine = sum) %do% {
  foreach(k = 1:1000, .combine = "c",
          .multicombine = TRUE, .inorder = FALSE) %dopar% {
    sin(i) * cos(k)
  }
}
end <- proc.time() - start
end
# 35.79
SomeResult
# [1] 0.6106603

Page 194: HPC in R

Parallel computation with R: parallel execution of for-loops
Examples of foreach usage

However, the reverse construction (outer loop with %dopar%, inner loop with %do%) does not work. It’s sad...

# Not run
start <- proc.time()
SomeResult <- foreach(i = 1:4, .combine = sum) %dopar% {
  foreach(k = 1:1000, .combine = "c",
          .multicombine = TRUE, .inorder = FALSE) %do% {
    sin(i) * cos(k)
  }
}
end <- proc.time() - start
end
SomeResult
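The slide does not say why this construction fails. One plausible cause, stated here as an assumption, is that with cluster-style backends (for example doSNOW, or doParallel with a socket cluster) the worker processes start without the foreach package attached, so the inner foreach call and %do% cannot be found on the workers. Under that assumption, listing foreach in the .packages argument is a possible fix. The sketch below uses doParallel if it is installed and falls back to the sequential backend otherwise; the choice of backend and of two workers is illustrative.

```r
library(foreach)

# Assumption: doParallel is the backend; fall back to sequential if it is missing
use_cluster <- requireNamespace("doParallel", quietly = TRUE)
if (use_cluster) {
  cl <- parallel::makeCluster(2)
  doParallel::registerDoParallel(cl)
} else {
  registerDoSEQ()
}

# .packages = "foreach" loads foreach on each worker before the task runs
SomeResult <- foreach(i = 1:4, .combine = sum, .packages = "foreach") %dopar% {
  foreach(k = 1:1000, .combine = "c") %do% {
    sin(i) * cos(k)
  }
}

if (use_cluster) parallel::stopCluster(cl)
SomeResult
```

If the failure on the slide had a different cause, this change alone would not help, so the error message of the original run is worth checking first.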

Page 195: HPC in R

Parallel computation with R: parallel execution of for-loops
Examples of foreach usage

So, how can we execute four tasks (each with 10000000 iterations) in four threads in parallel?

# Define a function
# This function emulates a single 10000000-iteration task
# inside the foreach loop
# This is necessary because only the internal foreach loop
# can be executed in parallel mode
GetSomeData <- function(indexVal) {
  tmpData <- rep(NA, length.out = 10000000)
  for (j in 1:10000000) {
    tmpData[j] <- sin(indexVal) * cos(j)
  }
  return(tmpData)
}
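The explicit loop in GetSomeData is deliberately slow, since it only emulates a long-running task. For real work, the same values can be computed without a loop. The sketch below compares the two forms on a short vector; the extra argument n and the names GetSomeDataLoop and GetSomeDataVec are additions made for the illustration and do not appear on the slides.

```r
# Loop version, as on the slide, with the length as a parameter (n is an added argument)
GetSomeDataLoop <- function(indexVal, n) {
  tmpData <- rep(NA_real_, length.out = n)
  for (j in seq_len(n)) {
    tmpData[j] <- sin(indexVal) * cos(j)
  }
  tmpData
}

# Vectorized version: cos() is applied to the whole index vector at once
GetSomeDataVec <- function(indexVal, n) {
  sin(indexVal) * cos(seq_len(n))
}

loop_result <- GetSomeDataLoop(2, 1000)
vec_result <- GetSomeDataVec(2, 1000)
all.equal(loop_result, vec_result)
```

Vectorizing first and parallelizing second is usually the cheaper order of optimization, because it removes per-iteration interpreter overhead before any worker processes are involved.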

Page 196: HPC in R

Parallel computation with R: parallel execution of for-loops
Examples of foreach usage

# Four tasks, each with 10000000 iterations
# Sequentially
start <- proc.time()
SomeResult <- foreach(i = 1:4, .combine = sum,
                      .multicombine = TRUE, .inorder = FALSE) %do% {
  GetSomeData(i)
}
end <- proc.time() - start
end
# 120.49
SomeResult
# [1] -0.645559

Page 197: HPC in R

Parallel computation with R: parallel execution of for-loops
Examples of foreach usage

# Parallel execution
# So, here we send 10000000 iterations to each thread
start <- proc.time()
SomeResult <- foreach(i = 1:4, .combine = sum,
                      .multicombine = TRUE, .inorder = FALSE) %dopar% {
  GetSomeData(i)
}
end <- proc.time() - start
end
# 60.76
SomeResult
# [1] -0.645559

Page 198: HPC in R

Parallel computation with R: parallel execution of for-loops
Examples of foreach usage

# Using the nesting operator %:%, sequentially
start <- proc.time()
SomeResult <- foreach(i = 1:4, .combine = sum) %:%
  foreach(k = 1:1000, .combine = "c",
          .multicombine = TRUE, .inorder = FALSE) %do% {
    sin(i) * cos(k)
  }
end2 <- proc.time() - start
end2
# 22.19
SomeResult
# [1] 0.6106603

Page 199: HPC in R

Parallel computation with R: parallel execution of for-loops
Examples of foreach usage

# Using the nesting operator %:%, in parallel
start <- proc.time()
SomeResult <- foreach(i = 1:4, .combine = sum) %:%
  foreach(k = 1:1000, .combine = "c",
          .multicombine = TRUE, .inorder = FALSE) %dopar% {
    sin(i) * cos(k)
  }
end2 <- proc.time() - start
end2
# 235.44
SomeResult
# [1] 0.6106603

Page 200: HPC in R

Parallel computation with R: parallel execution of for-loops
Examples of foreach usage

# Using iterators and foreach together
# Define some function
simFun <- function(arg1, arg2) {
  tmp <- 2 * arg1 + 3 * arg2
  return(tmp)
}

# Generate some random data
avec <- rnorm(1000, 22, 3)
bvec <- rnorm(1000, 24, 5)

Page 201: HPC in R

Parallel computation with R: parallel execution of for-loops
Examples of foreach usage

# Initialize the iterators
iavec <- iter(avec)
ibvec <- iter(bvec)

# Sequential execution
start <- proc.time()
seqSimulationresult <- foreach(i = iavec, .combine = "cbind") %:%
  foreach(j = ibvec, .combine = "c") %do% {
    simFun(i, j)
  }
end <- proc.time() - start
end
# 4.90

Page 202: HPC in R

Parallel computation with R: parallel execution of for-loops
Examples of foreach usage

# Initialize the iterators
iavec <- iter(avec)
ibvec <- iter(bvec)

# Parallel execution
start <- proc.time()
parSimulationresult <- foreach(i = iavec, .combine = "cbind") %:%
  foreach(j = ibvec, .combine = "c") %dopar% {
    simFun(i, j)
  }
end <- proc.time() - start
end
# 13.57
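To see what shape of result the nested loop is meant to produce, it helps to run it on a few values and compare it with a base R computation. The following is a minimal sketch, assuming the foreach package is installed. It passes the vectors directly instead of pre-built iterator objects (foreach then creates the iterators itself), uses small made-up inputs, and runs with %do% so that no backend is needed.

```r
library(foreach)

simFun <- function(arg1, arg2) {
  2 * arg1 + 3 * arg2
}

# Small illustrative inputs (the slides use 1000 random values each)
avec <- c(1, 2, 3, 4, 5)
bvec <- c(10, 20, 30)

# Outer loop over avec builds the columns; inner loop over bvec fills each column
nested <- foreach(i = avec, .combine = "cbind") %:%
  foreach(j = bvec, .combine = "c") %do% {
    simFun(i, j)
  }

# The same table computed with base R
expected <- outer(bvec, avec, function(b, a) simFun(a, b))
```

The result has one row per element of bvec and one column per element of avec. The timings on the slides (4.90 s sequentially versus 13.57 s in parallel) suggest that for a function as cheap as simFun the cost of distributing the tasks outweighs the gain from parallel execution.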

Page 203: HPC in R

Parallel computation with R: parallel execution of for-loops
Examples of foreach usage

This example uses all the tricks

# Generate the grid
x <- seq(-10, 10, by = 0.1)
y <- seq(-10, 10, by = 0.1)

start <- proc.time()
z <- foreach(y = ivector(x, 4), .combine = cbind) %dopar% {
  y <- rep(y, each = length(x))
  del <- abs(1 + (x^2 + y^2)^0.7)
  r <- (x^2 + y^2) / 2
  matrix(10 * sin(r) / del, length(x))
}
end <- proc.time() - start
end
# 0.37
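The helper ivector is not defined on this slide; it is presumably an iterator that yields x in 4 chunks. A minimal sketch of the same idea, assuming only the foreach package and base R, builds the chunks with split() instead. The names chunk_id and chunks are illustrative, and %do% is used so that the sketch runs without a registered backend.

```r
library(foreach)

x <- seq(-10, 10, by = 0.1)

# Assign each position of x to one of 4 contiguous chunks
chunk_id <- cut(seq_along(x), breaks = 4, labels = FALSE)
chunks <- split(x, chunk_id)

# Each task computes the block of columns that belongs to its chunk
z <- foreach(y = chunks, .combine = cbind) %do% {
  y <- rep(y, each = length(x))
  del <- abs(1 + (x^2 + y^2)^0.7)
  r <- (x^2 + y^2) / 2
  matrix(10 * sin(r) / del, length(x))
}

# Reference: the full surface computed in one step with outer()
z_ref <- outer(x, x, function(a, b) {
  10 * sin((a^2 + b^2) / 2) / abs(1 + (a^2 + b^2)^0.7)
})
```

Sending a few large chunks rather than many single values keeps the number of tasks small, which is why this example runs quickly even in parallel.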

Page 204: HPC in R

Parallel computation with R: parallel execution of for-loops
Examples of foreach usage

Result of this code

# Plot the results as a perspective plot
persp(x, x, z, ylab = 'y', theta = 30, phi = 30,
      expand = 0.5, col = "lightblue")

Page 205: HPC in R

Parallel computation with R
Parallel computation with graphical processing unit

Package: gputools

This package provides R interfaces to a handful of common statistical algorithms. These algorithms are implemented in parallel using a mixture of Nvidia’s CUDA language, Nvidia’s CUBLAS library, and EM Photonics’ CULA libraries.
On a computer equipped with an Nvidia GPU, some of these functions may be substantially more efficient than native R routines.

Note!
Simply put, this package contains a set of specialized functions that can use the GPU for computing. The full list of functions, with descriptions, can be found in the documentation. However, this package is available only for Linux.

Page 213: HPC in R

Parallel computation with R
Parallel computation with graphical processing unit

A short example of gputools usage:

# GPU. Here is an example:
library(gputools)
matA <- matrix(runif(3 * 2), 3, 2)
matB <- matrix(runif(3 * 4), 3, 4)
# Perform a matrix cross-product with a GPU
gpuCrossprod(matA, matB)
numVectors <- 5
dimension <- 10
Vectors <- matrix(runif(numVectors * dimension),
                  numVectors, dimension)
gpuDist(Vectors, "euclidean")
gpuDist(Vectors, "maximum")
gpuDist(Vectors, "manhattan")
gpuDist(Vectors, "minkowski", 4)
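Since gputools needs an Nvidia GPU, a CPU reference is useful for checking results. The sketch below assumes, as the function names suggest, that gpuCrossprod and gpuDist mirror the base R functions crossprod and dist; it uses only base R, so it runs without a GPU, and set.seed is added only to make the run reproducible.

```r
# CPU reference computations with base R (no GPU or gputools required)
set.seed(1)
matA <- matrix(runif(3 * 2), 3, 2)
matB <- matrix(runif(3 * 4), 3, 4)

# crossprod(A, B) computes t(A) %*% B
cpu_cross <- crossprod(matA, matB)

numVectors <- 5
dimension <- 10
Vectors <- matrix(runif(numVectors * dimension), numVectors, dimension)

# Pairwise distances between the rows of Vectors
d_euclidean <- dist(Vectors, method = "euclidean")
d_maximum <- dist(Vectors, method = "maximum")
d_manhattan <- dist(Vectors, method = "manhattan")
d_minkowski <- dist(Vectors, method = "minkowski", p = 4)
```

For matrices this small the CPU version is likely to be faster than any GPU call, because copying data to the device has a fixed cost; a GPU tends to pay off only for much larger inputs.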

Page 214: HPC in R

Working with very large datasets
Package bigmemory

Motivation
Multi-gigabyte data sets challenge and frustrate R users, even on well-equipped hardware. Use of C/C++ can provide efficiencies, but is cumbersome for interactive data analysis and lacks the flexibility and power of R’s rich statistical programming environment.

Description

The package bigmemory and its sister packages bridge this gap, implementing massive matrices and supporting their manipulation and exploration.
The data structures may be allocated to shared memory, allowing separate processes on the same computer to share access to a single copy of the data set.
The data structures may also be file-backed, allowing users to easily manage and analyze data sets larger than available RAM and share them across nodes of a cluster.

Page 222: HPC in R

Working with very large datasets
Bigmemory usage examples

# Here is an example that uses a very, very large matrix
# This example illustrates how to work with a
# big.matrix: no 2147483648 object size limitation.
library(bigmemory)
R <- 3e9  # 3 billion rows
C <- 2    # 2 columns
print("48 GB total size:")
R * C * 8  # 48 GB total size
x <- filebacked.big.matrix(R, C,
                           type = 'double',
                           backingfile = 'huge-data.bin',
                           descriptorfile = 'huge-data.desc')
# Generates the huge-data.bin and huge-data.desc files.
# Now we can use the huge-data.desc file in any R session.
x[1, ] <- rnorm(C)
x[nrow(x), ] <- runif(C)
summary(x[1, ])
summary(x[nrow(x), ])
# Note: This example will leave a 48 GB file on your hard drive!
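The remark about reusing the descriptor file can be tried safely at a much smaller scale. The following is a minimal sketch, assuming the bigmemory package is installed and that attach.big.matrix accepts the path to a descriptor file, as in recent package versions. The file names small-data.bin and small-data.desc are hypothetical, and the files are written to a temporary directory rather than the working directory.

```r
library(bigmemory)

# Small illustrative matrix in a temporary directory (file names are hypothetical)
backing_dir <- tempdir()
small <- filebacked.big.matrix(1000, 2,
                               type = "double",
                               backingfile = "small-data.bin",
                               backingpath = backing_dir,
                               descriptorfile = "small-data.desc")
small[1, ] <- c(1.5, 2.5)

# Attach the same data through its descriptor file, as another session would
again <- attach.big.matrix(file.path(backing_dir, "small-data.desc"))
again[1, ]
```

Both handles refer to the same backing file, so values written through one are visible through the other without copying the matrix into R's memory.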

Page 223: HPC in R

Working with very large datasets
Package filehash

Motivation
Working with large datasets in R can be cumbersome because of the need to keep objects in physical memory. While many might generally see that as a feature of the system, the need to keep whole objects in memory creates challenges for those who might want to work interactively with large datasets.
Here we take a simple definition of "large dataset" to be any dataset that cannot be loaded into R as a single R object because of memory limitations. For example, a very large data frame might be too large for all of the columns and rows to be loaded at once. In such a situation, one might load only a subset of the rows or columns, if that is possible.


Description

The filehash package provides a full read-write implementation of a key-value database for R. The package does not depend on any external packages or software systems and is written entirely in R, making it readily usable on most platforms. The filehash package can be thought of as a specific implementation of the database concept, taking a slightly different approach to the problem.

Technical Note

Key-value databases are sometimes called hash tables. With filehash the values are stored in a file on the disk rather than in memory. When a user requests the values associated with a key, filehash finds the object on the disk, loads the value into R and returns it to the user. The package offers two formats for storing data on the disk: the values can be stored (1) concatenated together in a single file or (2) separately as a directory of files.
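The two storage formats can be tried side by side with a small sketch. It assumes the filehash package is installed and that the format names are "DB1" (single file) and "RDS" (directory of files), as in the package documentation; the temporary paths and the key "a" are made up for the example.

```r
library(filehash)

# Format 1: "DB1" keeps all values concatenated in a single file
single_path <- tempfile("filehash_db1_")
dbCreate(single_path, type = "DB1")
db_single <- dbInit(single_path, type = "DB1")

# Format 2: "RDS" keeps each value in its own file inside a directory
dir_path <- tempfile("filehash_rds_")
dbCreate(dir_path, type = "RDS")
db_dir <- dbInit(dir_path, type = "RDS")

# Store a value under a key, then request it back by that key;
# the value is read from disk at the moment dbFetch is called
dbInsert(db_single, "a", 1:5)
dbInsert(db_dir, "a", 1:5)
from_single <- dbFetch(db_single, "a")
from_dir <- dbFetch(db_dir, "a")
```

Both handles are used through the same functions, so the choice of format affects only how the data are laid out on disk.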


Working with very large datasets
Filehash usage examples

# Load the library
library(filehash)
# Create the hash database on the hard disk
DATA_PATH <- "E:/R_works/file_hash_data_strorage/db_test"
DATA_PATH
dbCreate(DATA_PATH)
# Initialize a link to our hash database
db <- dbInit(DATA_PATH)
# Load a matrix into our database
# Dimensions
its = 3000000; dim = 10
dbInsert(db, "our_big_matrix",
         matrix(rnorm(its * dim), its, dim))
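Once a value is stored, it can be checked, listed, read back and deleted by key. The following is a minimal sketch of those steps, assuming the filehash package is installed; it uses a temporary path and a much smaller matrix than the slide, and the names demo_path and "small_matrix" are made up for the example.

```r
library(filehash)

# Hypothetical temporary location instead of the fixed path on drive E:
demo_path <- tempfile("filehash_demo_")
dbCreate(demo_path)
db <- dbInit(demo_path)

# A much smaller matrix than on the slide, so the sketch runs quickly
dbInsert(db, "small_matrix", matrix(rnorm(1000 * 10), 1000, 10))

dbExists(db, "small_matrix")   # is the key present?
dbList(db)                     # all keys stored in the database

# Fetch the value from disk only when it is needed
m <- dbFetch(db, "small_matrix")
col_means <- colMeans(m)

# Remove the key when it is no longer needed
dbDelete(db, "small_matrix")
```

The matrix occupies memory only while m exists; after rm(m) the data remain available on disk under their key until dbDelete is called.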


Final words, some useful references and contacts
Some useful references

Here are some useful links:

The book The Art of R Programming - http://heather.cs.ucdavis.edu/~matloff/132/NSPpart.pdf

The book Econometrics in R - http://cran.r-project.org/doc/contrib/Farnsworth-EconometricsInR.pdf

R Installation and Administration - http://cran.r-project.org/doc/manuals/R-admin.html

A very interesting presentation about HPC in R - http://www.slideshare.net/bytemining/r-hpc

Aggregator of R blog posts - http://www.r-bloggers.com/

Page of the commercial R project - http://www.revolutionanalytics.com/

There are many other sites. If you have a problem, just ask Google: How to "formulation of your problem" in R


Final words, some useful references and contacts
Final words and contacts

Well... this presentation is only the beginning of my work in this direction. This is only my first try. I will continue this work and will add future versions of this presentation with new material and examples as soon as I have more free time. As for the quality of this version of the presentation: it is my first experience with the LaTeX system, so don't judge me harshly. If you are interested in this area or have some ideas, you can just write to me. I am open to discussion. This is my contact list:

email: [email protected]
Facebook page: facebook.com/aleksey.kutergin
VK page: vk.com/aleksey_v_kutergin
