Interfacing C++ code from R
Søren Højsgaard
Department of Mathematical Sciences
Aalborg University, Denmark
November 20, 2012
Printed: November 20, 2012 File: interfaceCpp-slides.tex
Contents
1 Calling C++ from R
2 Some important libraries and packages
3 Example: The exponential function
  3.1 Using Rcpp together with inline
4 Example: Calculating Fibonacci recursively
5 Example: Matrix multiplication
  5.1 Using Rcpp together with inline
  5.2 Using RcppArmadillo together with inline
  5.3 Benchmarking – III
6 Compiling without using inline
  6.1 Example: Matrix multiplication using Rcpp
  6.2 Example: Matrix multiplication using RcppArmadillo
  6.3 Example: Inverting a symmetric positive definite matrix using RcppArmadillo
7 Further examples on using RcppArmadillo with inline
  7.1 Example: Extracting submatrices
8 Calling R from C++
9 Building packages using Rcpp libraries
10 EXERCISES
1 Calling C++ from R
C++ code can be called from R.
• Easily done using the Rcpp package. (There are other ways, not discussed here.)
• The Rcpp package combines nicely with the inline package.
• There are several existing libraries available with a C++ interface that we may use instead of “reinventing the wheel” (as we did when creating our own matrix multiplication functions).
2 Some important libraries and packages
• Armadillo is an open-source C++ linear algebra library (matrix maths) aiming towards a good balance between speed and ease of use.
http://arma.sourceforge.net/
R interface via RcppArmadillo
• Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.
http://eigen.tuxfamily.org/
R interface via RcppEigen
• The GNU Scientific Library (GSL) is a numerical library for C and C++ programmers.
http://www.gnu.org/software/gsl/
R interface via RcppGSL
Notice:
• To use these packages we must program in C++ rather than in C.
• Dirk Eddelbuettel provides many examples on his website: http://dirk.eddelbuettel.com/code/rcpp.html
3 Example: The exponential function
Recall our implementation of the exponential function in R:
> expfunR
A pure C implementation (which ignores the numerical difficulties when x < 0) is:
#include <math.h>
double C_expfun2 (double x){
  double ans=1.0, term=1.0, eps=1e-16;
  int n=0;
  while (fabs(term)>eps){
    n++;
    term = term * x / n;
    ans = ans + term;
  }
  return(ans);
}
3.1 Using Rcpp together with inline
> src library(inline)
> expfunC
An alternative implementation uses the original function, with the numerical
difficulties for x < 0 resolved:
> incltxt expfunC2
> xx <- 1
> c(expfunR(xx), expfunC(xx), expfunC2(xx))
[1] 2.718282 2.718282 2.718282
> xx <- 20
> c(expfunR(xx), expfunC(xx), expfunC2(xx))
[1] 485165195 485165195 485165195
> xx <- -20
> c(expfunR(xx), expfunC(xx), expfunC2(xx))
[1] 2.061154e-09 5.621884e-09 2.061154e-09
> (expfunC(-20)-exp(-20))/exp(-20)
[1] 1.727543
> (expfunC2(-20)-exp(-20))/exp(-20)
[1] 2.006596e-16
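The large relative error of expfunC at x = -20 comes from cancellation: for negative x the terms of the series alternate in sign and grow large before they shrink. One common remedy is to sum the series for |x| and take the reciprocal when x < 0. The sketch below illustrates that idea in plain C++; the name expfun_stable is hypothetical, and this is not necessarily how expfunC2 is implemented.

```cpp
#include <cmath>

// Taylor series for exp(x), summed only for a non-negative argument.
// For x < 0 the series alternates in sign and suffers from cancellation,
// so exp(|x|) is evaluated and its reciprocal is returned instead.
double expfun_stable(double x) {
    const double eps = 1e-16;
    const double ax = std::fabs(x);
    double ans = 1.0, term = 1.0;
    int n = 0;
    while (std::fabs(term) > eps) {
        n++;
        term = term * ax / n;   // term is now ax^n / n!
        ans += term;
    }
    return (x < 0) ? 1.0 / ans : ans;
}
```

All terms are positive in this version, so rounding errors no longer swamp a small result.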
> library(rbenchmark)
> cols <- c("test", "replications", "elapsed", "relative")
> N <- 20000
> benchmark(expfunR(10), expfunC(10), expfunC2(10),
+ columns=cols, order="relative", replications=N)
test replications elapsed relative
2 expfunC(10) 20000 0.08 1.0
3 expfunC2(10) 20000 0.08 1.0
1 expfunR(10) 20000 2.44 30.5
4 Example: Calculating Fibonacci recursively
The standard definition of the Fibonacci sequence is $f_n = f_{n-1} + f_{n-2}$ where
$f_0 = 0$, $f_1 = 1$.
A simple recursive implementation in R is:
> fibR
Easy to write fast C++ version:
> incltxt fibC
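The source of fibC is not reproduced here. A recursive C++ function of the kind described, written directly from the definition, might look like the sketch below. The name fib_rec is hypothetical, and n >= 0 is assumed.

```cpp
// Naive recursion straight from f(n) = f(n-1) + f(n-2), f(0) = 0, f(1) = 1.
// The running time grows exponentially in n, which makes it a convenient
// benchmark for function-call overhead rather than a practical algorithm.
// Assumes n >= 0.
int fib_rec(int n) {
    if (n < 2) return n;
    return fib_rec(n - 1) + fib_rec(n - 2);
}
```

For example, fib_rec(10) evaluates to 55.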
> library(rbenchmark)
> cols M benchmark(fibR(M), fibC(M),
+ columns=cols, order="relative", replications=1)
test replications elapsed relative
1 fibR(M) 1 7.86 NA
2 fibC(M) 1 0.00 NA
5 Example: Matrix multiplication
Recall our first version of the matrix multiplication function:
/* File: matprod1.c: Calculate the product of matrices X and Y */
void matprod1(double *X, int *dimX, double *Y, int *dimY, double *ans){
  double sum;
  int ii, jj, kk;
  int nrX=dimX[0], ncX=dimX[1], nrY=dimY[0], ncY=dimY[1];

  for (ii=0; ii
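The listing breaks off at the first loop. As a hedged illustration of how such a function is usually written, the sketch below multiplies two matrices stored column-major, as R stores them, so that element (i, j) of X sits at X[i + nrX * j]. The name matprod_sketch is hypothetical, and the code is not a reconstruction of the missing lines.

```cpp
// Product of an nrX-by-ncX matrix X and an nrY-by-ncY matrix Y,
// both stored column-major. The caller must supply ans with room
// for nrX * ncY doubles, and ncX must equal nrY.
void matprod_sketch(const double* X, const int* dimX,
                    const double* Y, const int* dimY, double* ans) {
    const int nrX = dimX[0], ncX = dimX[1], nrY = dimY[0], ncY = dimY[1];
    for (int ii = 0; ii < nrX; ii++) {
        for (int jj = 0; jj < ncY; jj++) {
            double sum = 0.0;
            for (int kk = 0; kk < ncX; kk++) {
                sum += X[ii + nrX * kk] * Y[kk + nrY * jj];
            }
            ans[ii + nrX * jj] = sum;  // element (ii, jj) of the product
        }
    }
}
```

Passing the dimensions as int arrays mirrors the signature above, where R hands over dim(X) and dim(Y) as integer vectors.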
An interface using SEXPs is:
/* File: matprod2.c: Calculates the product of matrices X and Y */
#include <Rdefines.h>
#include "matprod1.h"

SEXP matprod2(SEXP X, SEXP Y) {
  int nprot=0;
  PROTECT(X = AS_NUMERIC(X)); nprot++;   /* Digest SEXPs from R */
  PROTECT(Y = AS_NUMERIC(Y)); nprot++;
  double *xptr; xptr = REAL(X);
  double *yptr; yptr = REAL(Y);
  int *dimX; dimX = INTEGER(GET_DIM(X));
  int *dimY; dimY = INTEGER(GET_DIM(Y));
  SEXP ans;                              /* Create SEXP to hold result */
  PROTECT(ans = allocMatrix(REALSXP, dimX[0], dimY[1])); nprot++;
  double *ansptr; ansptr = REAL(ans);
  matprod1(xptr, dimX, yptr, dimY, ansptr); /* Calculate product */
  UNPROTECT(nprot);                      /* Wrap up */
  return(ans);                           /* Return the result to R */
}
where the header file matprod1.h contains
void matprod1(double *X, int *dimX, double *Y, int *dimY, double *ans);
5.1 Using Rcpp together with inline
With Rcpp, matrices can be indexed the usual way:
> src library(inline)
> mprod5_inline_Rcpp
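Indexing "the usual way" means writing X(i, j) instead of computing offsets by hand. The source of mprod5_inline_Rcpp is not reproduced here. To show the idea in a self-contained way, the sketch below uses a hypothetical ColMajorMatrix class (not Rcpp's NumericMatrix) whose operator() hides the column-major offset.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a matrix type with (i, j) indexing.
// Storage is column-major, so (i, j) lives at offset i + nrow * j.
class ColMajorMatrix {
public:
    ColMajorMatrix(int nrow, int ncol)
        : nrow_(nrow), ncol_(ncol),
          data_(static_cast<std::size_t>(nrow) * ncol, 0.0) {}
    int nrow() const { return nrow_; }
    int ncol() const { return ncol_; }
    double& operator()(int i, int j) {
        return data_[i + static_cast<std::size_t>(nrow_) * j];
    }
    double operator()(int i, int j) const {
        return data_[i + static_cast<std::size_t>(nrow_) * j];
    }
private:
    int nrow_, ncol_;
    std::vector<double> data_;
};

// Matrix product written with (i, j) indexing instead of pointer arithmetic.
// Assumes X.ncol() == Y.nrow().
ColMajorMatrix multiply(const ColMajorMatrix& X, const ColMajorMatrix& Y) {
    ColMajorMatrix ans(X.nrow(), Y.ncol());
    for (int i = 0; i < X.nrow(); i++) {
        for (int j = 0; j < Y.ncol(); j++) {
            double sum = 0.0;
            for (int k = 0; k < X.ncol(); k++) sum += X(i, k) * Y(k, j);
            ans(i, j) = sum;
        }
    }
    return ans;
}
```

The triple loop is the same as in the pointer version; only the element access changes, which is the readability gain that a matrix class offers.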
5.2 Using RcppArmadillo together with inline
The Armadillo library is an excellent C++ package for linear algebra, and
RcppArmadillo makes it easy to use from R:
> src mprod6_inline_RcppArma
5.3 Benchmarking – III
> library(rbenchmark)
> cols N A
Tentative conclusions on the benchmarking:
• Speedwise, .Call() is better than .C().
• Speedwise, C is better than C++.
• Execution time of a program must be traded off against the programming time needed to make the program work.
• For larger matrices our own “homegrown” C code seems to lose to the other competitors.
6 Compiling without using inline
Rcpp based code can be compiled using R CMD SHLIB.
To do so, one must tell the compiler where to find the headers and tell the linker
which libraries to link against and where to find them. One way of doing so is by
creating a Makevars file.
Below are the Makevars files that get things to work on windows and linux:
Using Rcpp: Using Rcpp alone, a Makevars file with these lines works on both
linux and windows:
PKG_LIBS=`Rscript -e "Rcpp:::LdFlags()"`
PKG_CXXFLAGS=`Rscript -e "Rcpp:::CxxFlags()"`
Using RcppArmadillo: For compilation on windows, the file Makevars.win
contains these lines:
PKG_LIBS = $(BLAS_LIBS) $(FLIBS) $(LAPACK_LIBS) \
$(shell "Rscript.exe" -e "Rcpp:::LdFlags()")
PKG_CPPFLAGS = -I${R_HOME}/include -I${R_HOME}/library/Rcpp/include \
-I${R_HOME}/library/RcppArmadillo/include -I. -DNDEBUG
For compilation on linux, the Makevars file contains the lines:
PKG_LIBS = $(BLAS_LIBS) $(FLIBS) $(LAPACK_LIBS) \
$(shell "Rscript" -e "Rcpp:::LdFlags()")
## If Rcpp etc. are installed in /usr/local/lib/R/site-library
R_SITE=/usr/local/lib/R/site-library
PKG_CPPFLAGS = -I${R_HOME}/include -I${R_SITE}/Rcpp/include \
-I${R_SITE}/RcppArmadillo/include -I. -DNDEBUG
## If Rcpp etc. are installed in /usr/lib/R/ use instead:
### PKG_CPPFLAGS = -I${R_HOME}/include -I${R_HOME}/library/Rcpp/include \
### -I${R_HOME}/library/RcppArmadillo/include -I. -DNDEBUG
Using RcppEigen: For compilation on windows, the file Makevars.win contains
these lines:
PKG_LIBS = $(BLAS_LIBS) $(FLIBS) \
$(shell "Rscript.exe" -e "Rcpp:::LdFlags()")
PKG_CPPFLAGS = -I${R_HOME}/library/RcppEigen/include \
-I${R_HOME}/library/Rcpp/include -I. -DNDEBUG
For compilation on linux, the Makevars file contains the lines
PKG_LIBS = `$(R_HOME)/bin/Rscript -e "Rcpp:::LdFlags()"`
## If Rcpp etc. are installed in /usr/local/lib/R/site-library
R_SITE=/usr/local/lib/R/site-library
PKG_CPPFLAGS = -I${R_HOME}/include -I${R_SITE}/Rcpp/include \
-I${R_SITE}/RcppEigen/include -I. -DNDEBUG
## If Rcpp etc. are installed in /usr/lib/R/ use instead:
## PKG_CPPFLAGS = -I${R_HOME}/library/Rcpp/include \
## -I${R_HOME}/library/RcppEigen/include -I. -DNDEBUG
NOTICE: For easier reading, a long line is broken up by putting a backslash followed by enter as the last characters of the line and continuing the (logical) line on the next (physical)
line. (Hence there must be no white space following the backslash.)
6.1 Example: Matrix multiplication using Rcpp
With Rcpp, matrices can be indexed the usual way. Consider the file:
// File: matprod7.cpp
#include <Rcpp.h>

RcppExport SEXP matprod7( SEXP X_, SEXP Y_){
  Rcpp::NumericMatrix X(X_);
  Rcpp::NumericMatrix Y(Y_);
  Rcpp::NumericMatrix ans (X.nrow(), Y.ncol());
  int ii, jj, kk;
  for (ii=0; ii
To compile this file, first create a file named Makevars with the content (notice
the back quotes):
PKG_LIBS=`Rscript -e "Rcpp:::LdFlags()"`
PKG_CXXFLAGS=`Rscript -e "Rcpp:::CxxFlags()"`
Next, compile the file as usual:
R CMD SHLIB src/matprod7.cpp
which creates matprod7.dll / matprod7.so.
> mprod7_Rcpp A mprod7_Rcpp(A, B)
[,1] [,2] [,3]
[1,] 30 66 102
[2,] 36 81 126
[3,] 42 96 150
> dyn.unload("src/matprod7.dll")
6.2 Example: Matrix multiplication using RcppArmadillo
#include <RcppArmadillo.h>

RcppExport SEXP matprod8( SEXP X_, SEXP Y_ ){
  arma::mat X = Rcpp::as<arma::mat>(X_);
  arma::mat Y = Rcpp::as<arma::mat>(Y_);
  arma::mat ans = X * Y;
  return Rcpp::wrap(ans);
}
R CMD SHLIB src_arma/matprod8.cpp
> dyn.load("src_arma/matprod8.dll")
> .Call("matprod8", A, B)
[,1] [,2] [,3]
[1,] 30 66 102
[2,] 36 81 126
[3,] 42 96 150
> dyn.unload("src_arma/matprod8.dll")
6.3 Example: Inverting a symmetric positive definite matrix using RcppArmadillo
Consider inverting a symmetric positive definite matrix. An R approach to doing so is via a Cholesky decomposition:
> spdinv_R
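The definitions of spdinv_R and of the Armadillo routine are not reproduced here. To make the Cholesky route concrete, the sketch below spells out the textbook steps in plain C++ without Armadillo: factor A = L L^T, invert the lower-triangular L, and form A^{-1} = (L^{-1})^T L^{-1}. The name spd_inverse and the nested-vector matrix type are assumptions made for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

using Mat = std::vector<std::vector<double>>;  // row-major: A[i][j]

// Invert a symmetric positive definite matrix via its Cholesky factor.
Mat spd_inverse(const Mat& A) {
    const std::size_t n = A.size();

    // Step 1: Cholesky factorisation A = L L^T, L lower triangular.
    Mat L(n, std::vector<double>(n, 0.0));
    for (std::size_t j = 0; j < n; j++) {
        double d = A[j][j];
        for (std::size_t k = 0; k < j; k++) d -= L[j][k] * L[j][k];
        if (d <= 0.0) throw std::domain_error("matrix is not positive definite");
        L[j][j] = std::sqrt(d);
        for (std::size_t i = j + 1; i < n; i++) {
            double s = A[i][j];
            for (std::size_t k = 0; k < j; k++) s -= L[i][k] * L[j][k];
            L[i][j] = s / L[j][j];
        }
    }

    // Step 2: forward substitution, column by column, gives Linv = L^{-1}.
    Mat Linv(n, std::vector<double>(n, 0.0));
    for (std::size_t j = 0; j < n; j++) {
        Linv[j][j] = 1.0 / L[j][j];
        for (std::size_t i = j + 1; i < n; i++) {
            double s = 0.0;
            for (std::size_t k = j; k < i; k++) s -= L[i][k] * Linv[k][j];
            Linv[i][j] = s / L[i][i];
        }
    }

    // Step 3: A^{-1}(i, j) = sum_k Linv(k, i) * Linv(k, j).
    // Linv is lower triangular, so terms with k < max(i, j) vanish.
    Mat inv(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; i++)
        for (std::size_t j = 0; j < n; j++)
            for (std::size_t k = (i > j ? i : j); k < n; k++)
                inv[i][j] += Linv[k][i] * Linv[k][j];
    return inv;
}
```

In practice a linear algebra library does these steps with tuned LAPACK routines, which is the point of using RcppArmadillo rather than hand-written loops.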
Benchmarks
> library(MASS)
> library(Matrix)
> PP X Xm library(rbenchmark)
> cols <- c("test", "replications", "elapsed", "relative")
> dyn.load("src_arma/spdinv-arma.dll")
> benchmark(spdinv_R(X), solve(X), solve(Xm), .Call("C_spdinv_arma", X),
+ columns=cols, replications=10000)
test replications elapsed relative
4 .Call("C_spdinv_arma", X) 10000 0.06 1.000
2 solve(X) 10000 0.92 15.333
3 solve(Xm) 10000 0.26 4.333
1 spdinv_R(X) 10000 0.88 14.667
> dyn.unload("src_arma/spdinv-arma.dll")
7 Further examples on using RcppArmadillo with inline
7.1 Example: Extracting submatrices
> library(inline)
> library(Rcpp)
> submat
> M <- matrix(1:12, nrow=3)
> submat(M, 0, 0)
[,1] [,2] [,3] [,4]
[1,] 1 4 7 10
[2,] 2 5 8 11
[3,] 3 6 9 12
> submat(M, 1:2, 0)
[,1] [,2] [,3] [,4]
[1,] 1 4 7 10
[2,] 2 5 8 11
> submat(M, 0, 2:3)
[,1] [,2]
[1,] 4 7
[2,] 5 8
[3,] 6 9
> submat(M, 1:2, 2:3)
[,1] [,2]
[1,] 4 7
[2,] 5 8
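Judging from the output, submat takes 1-based row and column indices and appears to treat a single 0 as "select everything". The definition of submat is not reproduced here. The sketch below implements that inferred behaviour in plain C++ on a column-major matrix; the Matrix struct and the names expand_index and submat_sketch are hypothetical.

```cpp
#include <vector>

// Column-major matrix with explicit dimensions (hypothetical helper type).
struct Matrix {
    int nrow, ncol;
    std::vector<double> data;  // element (i, j) at data[i + nrow * j]
};

// A single 0 is read as "take all n indices"; otherwise the 1-based
// indices are used as given. This convention is inferred from the output.
std::vector<int> expand_index(const std::vector<int>& idx, int n) {
    if (idx.size() == 1 && idx[0] == 0) {
        std::vector<int> all(n);
        for (int i = 0; i < n; i++) all[i] = i + 1;
        return all;
    }
    return idx;
}

// Extract the submatrix given by 1-based row and column indices.
Matrix submat_sketch(const Matrix& M, const std::vector<int>& rows,
                     const std::vector<int>& cols) {
    const std::vector<int> r = expand_index(rows, M.nrow);
    const std::vector<int> c = expand_index(cols, M.ncol);
    Matrix out{static_cast<int>(r.size()), static_cast<int>(c.size()), {}};
    out.data.reserve(r.size() * c.size());
    for (int cj : c)        // columns outermost keeps the result column-major
        for (int ri : r)
            out.data.push_back(M.data[(ri - 1) + M.nrow * (cj - 1)]);
    return out;
}
```

With M holding 1 to 12 in three rows, rows {1, 2} and columns {2, 3} give the 2-by-2 block with entries 4, 7 in the first row and 5, 8 in the second, matching the last call above.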
8 Calling R from C++
> toString_ toString_(c("foo","bar","bob"),";")
[1] "foo;bar;bob"
> toString_(c(1,2,3),";")
[1] "1;2;3"
> get_index_ a str(a)
List of 4
$ : int [1:2] 1 2
$ : int [1:2] 2 3
$ : int [1:3] 1 3 5
$ : int [1:3] 1 3 NA
9 Building packages using Rcpp libraries
Your friends are:
> library(Rcpp)
> Rcpp.package.skeleton()
> library(RcppArmadillo)
> RcppArmadillo.package.skeleton
10 EXERCISES
1. Using inline and RcppArmadillo, implement a function that calculates the
conditional variance in a multivariate normal distribution, i.e.
$\mathrm{Var}(Y_a \mid Y_b) = \Sigma_{aa} - \Sigma_{ab} \Sigma_{bb}^{-1} \Sigma_{ba}$
2. Compare the performance in computing time with a pure R implementation:
> condVarR