
Interfacing C++ code from R

Søren Højsgaard

Department of Mathematical Sciences

Aalborg University, Denmark

November 20, 2012

Printed: November 20, 2012 File: interfaceCpp-slides.tex

Contents

1 Calling C++ from R

2 Some important libraries and packages

3 Example: The exponential function
3.1 Using Rcpp together with inline

4 Example: Calculating Fibonacci recursively

5 Example: Matrix multiplication
5.1 Using Rcpp together with inline
5.2 Using RcppArmadillo together with inline
5.3 Benchmarking – III

6 Compiling without using inline
6.1 Example: Matrix multiplication using Rcpp
6.2 Example: Matrix multiplication using RcppArmadillo
6.3 Example: Inverting a symmetric positive definite matrix using RcppArmadillo

7 Further examples on using RcppArmadillo with inline
7.1 Example: Extracting submatrices

8 Calling R from C++

9 Building packages using Rcpp libraries

10 EXERCISES

1 Calling C++ from R

C++ code can be called from R.

• This is easily done using the Rcpp package. (There are other ways, which are not discussed here.)

• The Rcpp package combines nicely with the inline package.

• There are several existing libraries available with a C++ interface that we may use instead of “reinventing the wheel” (as we did when creating our own matrix multiplication functions).

2 Some important libraries and packages

• Armadillo is an open-source C++ linear algebra library (matrix maths) aiming towards a good balance between speed and ease of use.

http://arma.sourceforge.net/

R interface via RcppArmadillo

• Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.

http://eigen.tuxfamily.org/

R interface via RcppEigen

• The GNU Scientific Library (GSL) is a numerical library for C and C++ programmers.

http://www.gnu.org/software/gsl/

R interface via RcppGSL


Notice:

• To use these packages we must program in C++ rather than in C.

• Dirk Eddelbuettel provides many examples on his website: http://dirk.eddelbuettel.com/code/rcpp.html

3 Example: The exponential function

Recall our implementation of the exponential function in R:

> expfunR

A pure C implementation (which ignores the numerical difficulties when x < 0) is:

#include <math.h>

/* Taylor series for exp(x); stop when the latest term is negligible */
double C_expfun2(double x){
  double ans=1.0, term=1.0, eps=1e-16;
  int n=0;
  while (fabs(term)>eps){
    n++;
    term = term * x / n;
    ans = ans + term;
  }
  return(ans);
}

3.1 Using Rcpp together with inline

> src library(inline)

> expfunC

An alternative implementation is to use the original function (in which the numerical difficulties when x < 0 have been accounted for):

> incltxt expfunC2


> xx c(expfunR(xx), expfunC(xx), expfunC2(xx))

[1] 2.718282 2.718282 2.718282


[1] 485165195 485165195 485165195


[1] 2.061154e-09 5.621884e-09 2.061154e-09

> (expfunC(-20)-exp(-20))/exp(-20)

[1] 1.727543

> (expfunC2(-20)-exp(-20))/exp(-20)

[1] 2.006596e-16
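The large relative error of expfunC(-20) is a cancellation effect: for x = -20 the series alternates in sign, with terms as large as roughly 4e7 that must cancel down to a result near 2e-9. One common remedy is to sum the series at -x and take the reciprocal. The source of expfunC2 is not shown here, so whether it does exactly this is an assumption; the names expfun_series and expfun_safe are hypothetical. A minimal plain C++ sketch:

```cpp
#include <cmath>

// Taylor series for exp(x), same scheme as C_expfun2; accurate for x >= 0.
double expfun_series(double x) {
    double ans = 1.0, term = 1.0;
    const double eps = 1e-16;
    int n = 0;
    while (std::fabs(term) > eps) {
        n++;
        term = term * x / n;
        ans = ans + term;
    }
    return ans;
}

// Hypothetical wrapper: for x < 0 use exp(x) = 1 / exp(-x), so that only
// positive terms are summed and no cancellation occurs.
double expfun_safe(double x) {
    return (x < 0) ? 1.0 / expfun_series(-x) : expfun_series(x);
}
```

With this wrapper the value at x = -20 is obtained from the well-behaved sum at x = 20, which only involves positive terms.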


> library(rbenchmark)

> cols <- c("test", "replications", "elapsed", "relative")
> N <- 20000
> benchmark(expfunR(10), expfunC(10), expfunC2(10),
+           columns=cols, order="relative", replications=N)
          test replications elapsed relative
2  expfunC(10)        20000    0.08      1.0
3 expfunC2(10)        20000    0.08      1.0
1  expfunR(10)        20000    2.44     30.5

4 Example: Calculating Fibonacci recursively

The standard definition of the Fibonacci sequence is $f_n = f_{n-1} + f_{n-2}$ where $f_0 = 0$, $f_1 = 1$.

A simple recursive implementation in R is:

> fibR

It is easy to write a fast C++ version:

> incltxt fibC
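The C++ source behind fibC is not reproduced here. As a rough illustration of what such a recursive function might look like in plain C++ (the name fib_rec is hypothetical, and the Rcpp/inline wrapping is left out), consider:

```cpp
// Recursive Fibonacci following f_n = f_{n-1} + f_{n-2}, f_0 = 0, f_1 = 1.
// Deliberately naive (exponential time) to mirror a recursive R version.
// Assumes n >= 0.
int fib_rec(int n) {
    if (n < 2) return n;   // base cases: f_0 = 0, f_1 = 1
    return fib_rec(n - 1) + fib_rec(n - 2);
}
```

The algorithm is identical to the recursive definition; any speed gain over R comes from the much lower cost of a function call in compiled code.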



> cols M benchmark(fibR(M), fibC(M),

+ columns=cols, order="relative", replications=1)


1 fibR(M)            1    7.86       NA
2 fibC(M)            1    0.00       NA

5 Example: Matrix multiplication

Recall our first version of the matrix multiplication function:

/* File: matprod1.c: Calculate the product of matrices X and Y */
void matprod1(double *X, int *dimX, double *Y, int *dimY, double *ans){
  double sum;
  int ii, jj, kk;
  int nrX=dimX[0], ncX=dimX[1], nrY=dimY[0], ncY=dimY[1];

  for (ii=0; ii
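The loop of matprod1 is cut off in this copy. As a sketch of how a triple loop over R's column-major storage can be written (not necessarily the original code; the function name and the flattened dimension arguments are assumptions made for illustration):

```cpp
// Product of X (nrX x ncX) and Y (ncX x ncY), both stored column-major as
// in R: element (i, j) of a matrix with nr rows lives at index i + j * nr.
// Assumes ans has room for nrX * ncY doubles.
void matprod_colmajor(const double *X, int nrX, int ncX,
                      const double *Y, int ncY, double *ans) {
    for (int ii = 0; ii < nrX; ii++) {
        for (int jj = 0; jj < ncY; jj++) {
            double sum = 0.0;
            for (int kk = 0; kk < ncX; kk++) {
                sum += X[ii + kk * nrX] * Y[kk + jj * ncX];
            }
            ans[ii + jj * nrX] = sum;   // element (ii, jj) of the result
        }
    }
}
```

The index arithmetic is the only R-specific part: R hands a matrix to C as one contiguous vector, column after column.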


An interface using SEXPs is:

/* File: matprod2.c: Calculates the product of matrices X and Y */
#include <Rdefines.h>
#include "matprod1.h"

SEXP matprod2(SEXP X, SEXP Y) {
  int nprot = 0;
  PROTECT(X = AS_NUMERIC(X)); nprot++;        /* Digest SEXPs from R */
  PROTECT(Y = AS_NUMERIC(Y)); nprot++;
  double *xptr; xptr = REAL(X);
  double *yptr; yptr = REAL(Y);
  int *dimX; dimX = INTEGER(GET_DIM(X));
  int *dimY; dimY = INTEGER(GET_DIM(Y));
  SEXP ans;                                   /* Create SEXP to hold result */
  PROTECT(ans = allocMatrix(REALSXP, dimX[0], dimY[1])); nprot++;
  double *ansptr; ansptr = REAL(ans);
  matprod1(xptr, dimX, yptr, dimY, ansptr);   /* Calculate product */
  UNPROTECT(nprot);                           /* Wrap up */
  return(ans);                                /* Return the result to R */
}

where matprod1.h contains:

void matprod1(double *X, int *dimX, double *Y, int *dimY, double *ans);

5.1 Using Rcpp together with inline

With Rcpp, matrices can be indexed the usual way:

> src library(inline)

> mprod5_inline_Rcpp


5.2 Using RcppArmadillo together with inline

The Armadillo library is an excellent C++ package for linear algebra, and RcppArmadillo makes it easy to use from R:

> src mprod6_inline_RcppArma


5.3 Benchmarking – III


> cols N A


Tentative conclusions on the benchmarking:

• Speedwise, .Call() is better than .C().

• Speedwise, C is better than C++.

• The execution time of a program must be traded off against the programming time needed to make the program work.

• For larger matrices, our own “homegrown” C code seems to lose to the other competitors.

6 Compiling without using inline

Rcpp based code can be compiled using R CMD SHLIB.

To do so, one must tell the compiler where to find the headers and tell the linker

which libraries to link against and where to find them. One way of doing so is by

creating a Makevars file.

Below are the Makevars files that get things to work on Windows and Linux:

Using Rcpp: Using Rcpp alone, a Makevars file with these lines works on both Linux and Windows:

PKG_LIBS=`Rscript -e "Rcpp:::LdFlags()"`
PKG_CXXFLAGS=`Rscript -e "Rcpp:::CxxFlags()"`

Using RcppArmadillo: For compilation on Windows, the file Makevars.win contains these lines:

PKG_LIBS = $(BLAS_LIBS) $(FLIBS) $(LAPACK_LIBS) \
    $(shell "Rscript.exe" -e "Rcpp:::LdFlags()")
PKG_CPPFLAGS = -I${R_HOME}/include -I${R_HOME}/library/Rcpp/include \
    -I${R_HOME}/library/RcppArmadillo/include -I. -DNDEBUG

For compilation on Linux, the Makevars file contains the lines:

PKG_LIBS = $(BLAS_LIBS) $(FLIBS) $(LAPACK_LIBS) \
    $(shell "Rscript" -e "Rcpp:::LdFlags()")
## If Rcpp etc. are installed in /usr/local/lib/R/site-library
R_SITE=/usr/local/lib/R/site-library
PKG_CPPFLAGS = -I${R_HOME}/include -I${R_SITE}/Rcpp/include \
    -I${R_SITE}/RcppArmadillo/include -I. -DNDEBUG
## If Rcpp etc. are installed in /usr/lib/R/ use instead:
### PKG_CPPFLAGS = -I${R_HOME}/include -I${R_HOME}/library/Rcpp/include \
###     -I${R_HOME}/library/RcppArmadillo/include -I. -DNDEBUG

Using RcppEigen: For compilation on Windows, the file Makevars.win contains these lines:

PKG_LIBS = $(BLAS_LIBS) $(FLIBS) \
    $(shell "Rscript.exe" -e "Rcpp:::LdFlags()")
PKG_CPPFLAGS = -I${R_HOME}/library/RcppEigen/include \
    -I${R_HOME}/library/Rcpp/include -I. -DNDEBUG

For compilation on Linux, the Makevars file contains the lines:

PKG_LIBS = `$(R_HOME)/bin/Rscript -e "Rcpp:::LdFlags()"`
## If Rcpp etc. are installed in /usr/local/lib/R/site-library
R_SITE=/usr/local/lib/R/site-library
PKG_CPPFLAGS = -I${R_HOME}/include -I${R_SITE}/Rcpp/include \
    -I${R_SITE}/RcppEigen/include -I. -DNDEBUG
## If Rcpp etc. are installed in /usr/lib/R/ use instead:
## PKG_CPPFLAGS = -I${R_HOME}/library/Rcpp/include \
##     -I${R_HOME}/library/RcppEigen/include -I. -DNDEBUG

NOTICE: For easier reading, a long line is broken up by ending it with a backslash ("\") immediately followed by a line break; the rest of this (logical) line continues on the next (physical) line. Hence there must be no white space following the backslash.

6.1 Example: Matrix multiplication using Rcpp

With Rcpp, matrices can be indexed the usual way. Consider the file:

// File: matprod7.cpp
#include <Rcpp.h>

RcppExport SEXP matprod7(SEXP X_, SEXP Y_){
  Rcpp::NumericMatrix X(X_);
  Rcpp::NumericMatrix Y(Y_);
  Rcpp::NumericMatrix ans(X.nrow(), Y.ncol());
  int ii, jj, kk;
  for (ii=0; ii

To compile this file, first create a file named Makevars with the content (notice

the back quotes):

PKG_LIBS=`Rscript -e "Rcpp:::LdFlags()"`
PKG_CXXFLAGS=`Rscript -e "Rcpp:::CxxFlags()"`

Next, compile the file as usual:

R CMD SHLIB src/matprod7.cpp

which creates matprod7.dll (on Windows) or matprod7.so (on Linux).

> mprod7_Rcpp A mprod7_Rcpp(A, B)

     [,1] [,2] [,3]
[1,]   30   66  102
[2,]   36   81  126
[3,]   42   96  150

> dyn.unload("src/matprod7.dll")


6.2 Example: Matrix multiplication using RcppArmadillo

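#include <RcppArmadillo.h>   // also pulls in the Rcpp headers

RcppExport SEXP matprod8(SEXP X_, SEXP Y_){
  arma::mat X = Rcpp::as<arma::mat>(X_);   // convert R matrices to Armadillo
  arma::mat Y = Rcpp::as<arma::mat>(Y_);
  arma::mat ans = X * Y;
  return Rcpp::wrap(ans);                  // convert the result back to a SEXP
}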

R CMD SHLIB src_arma/matprod8.cpp

> dyn.load("src_arma/matprod8.dll")

> .Call("matprod8", A, B)

     [,1] [,2] [,3]
[1,]   30   66  102
[2,]   36   81  126
[3,]   42   96  150

> dyn.unload("src_arma/matprod8.dll")


6.3 Example: Inverting a symmetric positive definite matrix using RcppArmadillo

Consider inverting a symmetric positive definite matrix. An R approach to doing so is via a Cholesky decomposition:

> spdinv_R
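Neither the body of spdinv_R nor the Armadillo source is reproduced here. To illustrate the underlying idea only: if A = L L^T is the Cholesky factorization, then A^{-1} = L^{-T} L^{-1} (in R this is commonly written chol2inv(chol(X))). The following plain C++ sketch spells out the three steps without any library; the function name spd_inverse and the row-major std::vector storage are assumptions made for illustration, not the author's code.

```cpp
#include <cmath>
#include <stdexcept>
#include <vector>

// Invert a symmetric positive definite n x n matrix A (row-major storage)
// via its Cholesky factor: A = L L^T, hence A^{-1} = L^{-T} L^{-1}.
std::vector<double> spd_inverse(const std::vector<double> &A, int n) {
    std::vector<double> L(n * n, 0.0), Linv(n * n, 0.0), Ainv(n * n, 0.0);

    // Step 1: Cholesky factor L (lower triangular), column by column.
    for (int j = 0; j < n; j++) {
        double d = A[j * n + j];
        for (int k = 0; k < j; k++) d -= L[j * n + k] * L[j * n + k];
        if (d <= 0.0) throw std::domain_error("matrix is not positive definite");
        L[j * n + j] = std::sqrt(d);
        for (int i = j + 1; i < n; i++) {
            double s = A[i * n + j];
            for (int k = 0; k < j; k++) s -= L[i * n + k] * L[j * n + k];
            L[i * n + j] = s / L[j * n + j];
        }
    }

    // Step 2: invert L by forward substitution, one column at a time.
    for (int j = 0; j < n; j++) {
        Linv[j * n + j] = 1.0 / L[j * n + j];
        for (int i = j + 1; i < n; i++) {
            double s = 0.0;
            for (int k = j; k < i; k++) s -= L[i * n + k] * Linv[k * n + j];
            Linv[i * n + j] = s / L[i * n + i];
        }
    }

    // Step 3: A^{-1} = Linv^T * Linv; Linv(k, i) is zero for k < i.
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            double s = 0.0;
            for (int k = (i > j ? i : j); k < n; k++)
                s += Linv[k * n + i] * Linv[k * n + j];
            Ainv[i * n + j] = s;
        }
    }
    return Ainv;
}
```

In practice one would leave these steps to a library such as Armadillo or LAPACK; the sketch is only meant to show why the Cholesky route works for symmetric positive definite matrices.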


Benchmarks

> library(MASS)

> library(Matrix)

> PP X Xm library(rbenchmark)

> cols dyn.load("src_arma/spdinv-arma.dll")

> benchmark(spdinv_R(X), solve(X), solve(Xm), .Call("C_spdinv_arma", X),

+ columns=cols, replications=10000)


4 .Call("C_spdinv_arma", X)        10000    0.06    1.000
2                  solve(X)        10000    0.92   15.333
3                 solve(Xm)        10000    0.26    4.333
1               spdinv_R(X)        10000    0.88   14.667

> dyn.unload("src_arma/spdinv-arma.dll")


7 Further examples on using RcppArmadillo with inline


7.1 Example: Extracting submatrices

> library(inline)

> library(Rcpp)

> submat


> M submat(M, 0, 0)

     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12

> submat(M, 1:2, 0)

     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11

> submat(M, 0, 2:3)

     [,1] [,2]
[1,]    4    7
[2,]    5    8
[3,]    6    9

> submat(M, 1:2, 2:3)

     [,1] [,2]
[1,]    4    7
[2,]    5    8
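Judging from the output, submat takes 1-based row and column indices and treats a single 0 as "all rows" or "all columns". The RcppArmadillo source is not reproduced here, so the following plain C++ sketch only mimics that apparent convention; the type Mat and the functions expand_index and submat_sketch are hypothetical names introduced for illustration.

```cpp
#include <vector>

// Column-major matrix with nrow rows and ncol columns, as in R.
struct Mat {
    int nrow, ncol;
    std::vector<double> x;
    double at(int i, int j) const { return x[i + j * nrow]; }
};

// Expand a 1-based index vector; the single value 0 stands for "all",
// which is the convention the calls above appear to follow.
std::vector<int> expand_index(const std::vector<int> &idx, int n) {
    if (idx.size() == 1 && idx[0] == 0) {
        std::vector<int> all(n);
        for (int i = 0; i < n; i++) all[i] = i + 1;
        return all;
    }
    return idx;
}

// Extract the submatrix given by 1-based row and column indices.
Mat submat_sketch(const Mat &M, const std::vector<int> &rows,
                  const std::vector<int> &cols) {
    std::vector<int> r = expand_index(rows, M.nrow);
    std::vector<int> c = expand_index(cols, M.ncol);
    Mat out{static_cast<int>(r.size()), static_cast<int>(c.size()), {}};
    out.x.reserve(r.size() * c.size());
    for (int j : c)          // column-major: columns in the outer loop
        for (int i : r)
            out.x.push_back(M.at(i - 1, j - 1));   // convert to 0-based
    return out;
}
```

No bounds checking is done; a production version would validate the indices before use.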


8 Calling R from C++

> toString_ toString_(c("foo","bar","bob"),";")

[1] "foo;bar;bob"

> toString_(c(1,2,3),";")

[1] "1;2;3"


> get_index_ a str(a)

List of 4

$ : int [1:2] 1 2

$ : int [1:2] 2 3

$ : int [1:3] 1 3 5

$ : int [1:3] 1 3 NA


9 Building packages using Rcpp libraries

Your friends are:

> library(Rcpp)

> Rcpp.package.skeleton()

> library(RcppArmadillo)

> RcppArmadillo.package.skeleton()


10 EXERCISES

1. Using inline and RcppArmadillo, implement a function that calculates the

conditional variance in a multivariate normal distribution, i.e.

$$\mathrm{Var}(Y_a \mid Y_b) = \Sigma_{aa} - \Sigma_{ab} \Sigma_{bb}^{-1} \Sigma_{ba}$$

2. Compare the performance in computing time with a pure R implementation:

> condVarR
