Romain Francois, Professional R Enthusiast

Friday, January 8 2010

External pointers with Rcpp

One of the new features of Rcpp is the XPtr class template, which lets you treat an R external pointer as a regular pointer. For more information on external pointers, see Writing R extensions.

To use it, we first need a pointer to some C++ data structure. Here we use a pointer to a vector<int>:

/* creating a pointer to a vector<int> */
std::vector<int>* v = new std::vector<int> ;
v->push_back( 1 ) ;
v->push_back( 2 ) ;

Then, using the XPtr class template, we wrap the pointer in an R external pointer:

/* wrap the pointer as an external pointer */
/* this automatically protects the external pointer from R garbage 
   collection until p goes out of scope. */
Rcpp::XPtr< std::vector<int> > p(v, true) ;

The first parameter of the constructor is the actual (sometimes called dumb) pointer, and the second parameter is a flag indicating that we need to register a delete finalizer with the external pointer. When the external pointer goes out of scope, it becomes subject to garbage collection, and when it is garbage collected, the finalizer is called, which then calls delete on the dumb pointer.
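The ownership idea can be sketched outside of R with plain C++. This is only an analogy and not how Rcpp implements XPtr: a std::shared_ptr with a custom deleter stands in for the external pointer and its finalizer, and reference counting stands in for R's garbage collector. The names wrap, delete_finalizer and finalizer_calls are made up for the illustration.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical counter, only here so the effect of the "finalizer" is visible.
int finalizer_calls = 0;

// Plays the role of the delete finalizer: frees the vector and records the call.
void delete_finalizer(std::vector<int>* v) {
    delete v;
    ++finalizer_calls;
}

// Wrap a raw ("dumb") pointer so that delete_finalizer runs exactly once,
// when the last owner of the pointer goes away.
std::shared_ptr<std::vector<int>> wrap(std::vector<int>* v) {
    assert(v != nullptr);
    return std::shared_ptr<std::vector<int>>(v, delete_finalizer);
}
```

As long as at least one copy of the returned smart pointer is alive, the vector stays valid. Once the last copy is destroyed, delete is called on the dumb pointer, which mirrors what the finalizer does when R collects the external pointer.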

Wrapping it all together thanks to the inline package, here's a function that creates an external pointer to a vector<int> and returns it to R:

        require(inline)
        funx <- cfunction(signature(), '
                /* creating a pointer to a vector<int> */
                std::vector<int>* v = new std::vector<int> ;
                v->push_back( 1 ) ;
                v->push_back( 2 ) ;
                
                /* wrap the pointer as an external pointer */
                /* this automatically protects the external pointer from R garbage 
                   collection until p goes out of scope. */
                Rcpp::XPtr< std::vector<int> > p(v, true) ;
                
                /* return it to R. Since p goes out of scope after the return, 
                   the external pointer is no longer protected by p, but it is 
                   protected by being referenced on the R side */
                return( p ) ;
        ', Rcpp=TRUE, verbose=FALSE)
        xp <- funx()

At that point, xp is an external pointer object:

> xp
<pointer: 0x9c850c8>
> typeof( xp )
[1] "externalptr"

Then, we can pass it back to the C(++) layer and continue to work with the wrapped STL vector of ints. For this we use the other constructor of the XPtr class template, which takes an R object (SEXP) of sexp type EXTPTRSXP.


/* wrap the SEXP as a smart external pointer */
Rcpp::XPtr< std::vector<int> > p(x) ;

/* use p as a 'dumb' pointer */
p->front() ;

Again, we can wrap this up for quick prototyping using the inline package :

        # passing the pointer back to C++
        funx <- cfunction(signature(x = "externalptr" ), '
                /* wrapping x as smart external pointer */
                /* The SEXP based constructor does not protect the SEXP from 
                   garbage collection automatically, it is already protected 
                   because it comes from the R side, however if you want to keep 
                   the Rcpp::XPtr object on the C(++) side
                   and return something else to R, you need to protect the external
                   pointer, by using the protect member function */
                Rcpp::XPtr< std::vector<int> > p(x) ;
                
                /* just return the front of the vector as a SEXP */
                return( Rcpp::wrap( p->front() ) ) ;
        ', Rcpp=TRUE, verbose=FALSE)
        front <- funx(xp)
> front
[1] 1

The example is extracted from one of the unit tests that we use in Rcpp. See the full example:

> system.file( "unitTests", "runit.XPTr.R", package = "Rcpp" )
[1] "/usr/local/lib/R/library/Rcpp/unitTests/runit.XPTr.R"

See also the announcement for the release of Rcpp 0.7.1 here to get a list of new features, or wait a few days to see version 0.7.2.

Using the XPtr class template is the bread and butter of the CPP package I blogged about here

Tuesday, December 29 2009

C++ exceptions at the R level

The feature described in this post is no longer valid with recent versions of Rcpp. Setting a terminate handler does not work reliably on windows, so we don't do it at all anymore. Exceptions need to be caught and relayed to R. Bracketing the code with BEGIN_RCPP / END_RCPP does it simply. See the Rcpp-introduction vignette for details.

I've recently offered an extra set of hands to Dirk to work on the Rcpp package, which serves as a good excuse to learn more about C++.

Exception management was quite high on my list. C++ has nice exception handling (well not as nice as java, but nicer than C).

With previous versions of Rcpp, the idiom was to wrap everything in a try/catch block and, within the catch block, call the Rf_error function to send an R error, the equivalent of calling stop. Now things have changed and, believe it or not, you can catch a C++ exception at the R level using the standard tryCatch mechanism.

For example, when you throw a C++ exception (inheriting from the class std::exception) at the C++ level and the exception is not picked up by the C++ code, it automatically sends an R condition that contains the message of the exception (what the what member function of std::exception gives) as well as the class of the exception (including namespace).

This, combined with the new inline support for Rcpp, allows us to run this code (also available in the inst/examples/RcppInline directory of Rcpp):

require(Rcpp)
require(inline)
funx <- cfunction(signature(), '
throw std::range_error("boom") ;
return R_NilValue ;
', Rcpp=TRUE, verbose=FALSE)

Here, we create the funx "function" that compiles itself into a C++ function and gets dynamically linked into R (thanks to the inline package). The relevant thing (at least for this post) is the throw statement. We throw a C++ exception of class "std::range_error" with the message "boom", and what follows shows how to catch it at the R level:

tryCatch(  funx(), "C++Error" = function(e){
    cat( sprintf( "C++ exception of class '%s' : %s\n", class(e)[1L], e$message  ) )
} )
# or using a direct handler 
tryCatch(  funx(), "std::range_error" = function(e){
        cat( sprintf( "C++ exception of class '%s' : %s\n", class(e)[1L], e$message  ) )
} )

... et voila

Under the hood, the abi namespace is at work to demangle the class name, and the function that grabs the uncaught exceptions is much inspired by the verbose terminate handler that comes with GCC.
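To give an idea of the demangling part, here is a minimal standalone sketch. It is not the code used in Rcpp: it only shows how the class name and the message of an exception can be recovered with abi::__cxa_demangle, which is available with GCC and Clang. The helper names demangle and describe are invented for this example.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>    // abi::__cxa_demangle (GCC and Clang)
#include <stdexcept>
#include <string>
#include <typeinfo>

// Turn a mangled type name into a readable one, including the namespace.
// Falls back to the mangled name if demangling fails.
std::string demangle(const char* mangled) {
    assert(mangled != nullptr);
    int status = 0;
    char* raw = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    std::string result = (status == 0 && raw != nullptr) ? raw : mangled;
    std::free(raw);  // the buffer returned by __cxa_demangle is malloc'ed
    return result;
}

// Hypothetical helper: combine the dynamic class of the exception and its
// message, formatted like the output of the R handlers shown above.
std::string describe(const std::exception& e) {
    return "C++ exception of class '" + demangle(typeid(e).name()) +
           "' : " + e.what();
}
```

Because typeid is applied to a reference to a polymorphic type, it reports the dynamic class of the exception, so a std::range_error caught as a std::exception is still described as std::range_error.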

Part of this was inspired by the new java exception handling that came with version 0.8-0 of rJava, but cooked with C++ ingredients.

Tuesday, December 22 2009

CPP package: exposing C++ objects

I've just started working on the new package CPP. As usual, the project is maintained in r-forge. The package aims at exposing C++ classes at the R level, starting with classes from the C++ standard template library.

Key to the package is the CPP function (much inspired by the J function of rJava). The CPP function builds an S4 object of class "C++Class". The "C++Class" class is currently a placeholder wrapping the C++ class name, and defines the new method (again, this trick of making new an S4 generic comes from rJava). For example, to create an R object that wraps up a std::vector<int>, one would go like this:

x <- new( CPP( "vector<int>" ) )

This is no magic, so don't expect to be able to send anything to CPP (C++ does not have reflection capabilities). Currently only these classes are defined: vector<int>, vector<double>, vector<raw> and set<int>.

Because C++ does not offer reflection capabilities, we have to do something else to be able to invoke methods on the wrapped objects. Currently the approach that the package follows is a naming convention. The $ method creates the name of the C routine it wants to call based on the C++ class the object is wrapping, the name of the method, and the types of the input parameters. So for example calling the size method for a vector<int> object yields this routine name: "vector_int____size", and calling the push_back method of the vector<double> class, passing an integer vector as the first parameter, yields this signature: "vector_double____push_back___integer" (the CPP:::getRoutineSignature function implements the convention).
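The convention can be sketched in a few lines of C++. The rule used here is only inferred from the two routine names quoted above (non-alphanumeric characters of the class name become underscores, and the parts are joined with three underscores); the real CPP:::getRoutineSignature may well differ in details. The function names sanitize and routine_name are invented for the sketch.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Replace every character that is not a letter or a digit by '_',
// so that "vector<int>" becomes "vector_int_".
std::string sanitize(const std::string& s) {
    std::string out;
    for (unsigned char c : s) {
        out += std::isalnum(c) ? static_cast<char>(c) : '_';
    }
    return out;
}

// Assumed rule: <sanitized class>___<method>[___<type of each argument>]
std::string routine_name(const std::string& cls, const std::string& method,
                         const std::vector<std::string>& arg_types = {}) {
    assert(!cls.empty() && !method.empty());
    const std::string sep = "___";
    std::string name = sanitize(cls) + sep + method;
    for (const std::string& type : arg_types) {
        name += sep + type;
    }
    return name;
}
```

With this rule, routine_name("vector<int>", "size") gives "vector_int____size" and routine_name("vector<double>", "push_back", {"integer"}) gives "vector_double____push_back___integer", which matches the two names quoted above.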

Here is a full example using the set<int> class. Sets are a good example of a data structure that is not available in R. Basically, a set keeps its elements sorted and without duplicates.

> # create the object
> x <- new( CPP("set<int>") )
> # insert data using the insert method
> # see : insert
> x$insert( sample( 1:20 ) )
> # ask for the size of the set
> x$size()
[1] 20
> # bring it back as an R classic integer vector
> as.vector( x )
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
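For comparison, this is roughly what happens on the C++ side when values go through a std::set<int>. It is a standalone sketch and not code from the CPP package; the function name through_set is made up.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <vector>

// Insert the values into a std::set<int>, then read them back in iteration
// order: the result is sorted and contains each value only once.
std::vector<int> through_set(const std::vector<int>& values) {
    std::set<int> s(values.begin(), values.end());
    std::vector<int> out(s.begin(), s.end());
    assert(std::is_sorted(out.begin(), out.end()));
    return out;
}
```

Whatever the order of the input, for instance a shuffled 1:20 as in the R session above, the values come back in increasing order.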

Currently the package is my excuse to learn about the standard template library, and it is quite possible that the functionality will be merged into the Rcpp it currently depends on. Because of this volatility, I'll use the Rcpp-devel mailing list instead of creating a new one.

Friday, December 11 2009

new R package : bibtex

I've pushed the bibtex package to CRAN.

The package defines the read.bib function that reads a file in the bibtex format. The code is based on bibparse

The read.bib function generates an object of class citationList, just like utils::citation

Sunday, November 22 2009

new R package : highlight

I finally pushed highlight to CRAN, which should be available in a few days. The package uses the information gathered by the parser package to perform syntax highlighting of R code

The main function of the package is highlight, which takes a number of arguments, including:

  • file : the file in which the R code is
  • output : some output connection or file name where to write the result (The default is standard output)
  • renderer : a collection of functions controlling how to render code into a given markup language

The package ships three functions that create such renderers

  • renderer_html : renders in html/css
  • renderer_latex: renders in latex
  • renderer_verbatim: does nothing

And additionally, the xterm256 package defines a renderer that allows syntax highlighting directly in the console (if the console knows xterm 256 colors)

Let's assume we have this code file (/tmp/code.R)

f <- function( x){
        x + rnorm(1)
}

g <- function(x){}
h <- function(x){}

Then we can syntax highlight it like this :

> highlight( "/tmp/code.R", renderer = renderer_html(), output = "/tmp/code.R.html" )
> highlight( "/tmp/code.R", renderer = renderer_latex(), output = "/tmp/code.R.latex" )

which makes these files : code.R.html and code.R.latex

The package also ships a sweave driver that can highlight code chunks in a sweave document, but I'll talk about this in another post

Monday, November 9 2009

LondonR slides

I was in London last week to present RemoteREngine at the LondonR user group sponsored by Mango Solutions.

Apart from minor technical details and upsetting someone because I did not mention that he once presented a much simpler solution to a quite different problem, it went pretty well and people were interested in what the package can do.

Essentially, RemoteREngine is an implementation of REngine using java rmi (remote method invocation) for the data transport.

This allows a (or several) client java application to embed an R engine that lives in a different java virtual machine, perhaps on a different physical machine. In a way it is quite similar to the Rserve implementation of REngine, but rmi gives better control over the data transport and we get things Rserve does not currently do such as support for environments or references.

The slides are available here and will probably also make their way to the conference site at some point

Friday, October 9 2009

celebrating R commit #50000

Today, Brian Ripley committed revision 50 000 to the R svn repository.

------------------------------------------------------------------------
r50000 | ripley | 2009-10-09 10:34:17 +0200 (Fri, 09 Oct 2009) | 1 line
Changed paths:
   M /branches/R-2-10-branch/src/library/stats/R/plot.lm.R

port r49999 from trunk
------------------------------------------------------------------------
r49999 | ripley | 2009-10-09 10:33:28 +0200 (Fri, 09 Oct 2009) | 2 lines
Changed paths:
   M /trunk/src/library/stats/R/plot.lm.R

workaround for PR#13899 (that in the report is broken and fails make check!)

so it is time to celebrate and have some fun with the svn log to analyze the 50 000 commits ... with R of course.

data extraction

First we need to grab the full svn log, using command line svn, something like this:

$ svn log -v https://svn.r-project.org/R > rsvn.log

... or you can download it from my website if you don't have svn on your machine

now we need to read the data into R :

we might also be interested in release date, version number and size of the distribution of each R release that is archived on CRAN, which we can get like this :

graphics

Now we can do some graphics. I'm using lattice here because I am familiar with it, but I'm sure interesting plots could be done using ggplot2. In fact, check out this post from Yihui Xie using ggplot2.

First I need to define some helper panel functions I'll use in the plots below

Number of commits per day

commits_day.png

... split by author

commits_author_day.png

The number of commits per month

commits_month.png

... split by author

commits_author_month.png

blogroll

Saturday, September 12 2009

New R package: sos

Searching help pages of contributed packages just got easier with the release of the new sos package. This is a replacement for and substantial enhancement of the existing "RSiteSearch" package. To learn more about it, try vignette("sos")

We hope you find this as useful as we have.

Spencer Graves, Sundar Dorai-Raj, Romain Francois

Tuesday, September 8 2009

search the graph gallery from R

This is a short code snippet that is motivated by this thread on r-help yesterday. The gallery contains a search engine textbox (top-right) that can be used to search for content in the website using either its internal crude search engine or perform a google search restricted to the gallery.

Here we write a small R function that can be used to take advantage of the search engine, from R

rgg.search <- function( topic, engine = c("Google", "RGG") ){

    engine <- match.arg( engine )
    url <- URLencode( sprintf( "http://addictedtor.free.fr/graphiques/search.php?q=%s&engine=%s", topic, engine ) )
    browseURL( url )
}
rgg.search( "Andrews plot" ) 

new R package : ant

The ant package was released to CRAN yesterday. As discussed in previous posts in this blog (here and here), the ant R package provides an R-aware version of the ant build tool from the apache project.

The package contains an R script that can be used to invoke ant with enough plumbing so that it can use R code during the build process. Calling the script is further simplified with the ant function included in the package.

$ Rscript -e "ant::ant()"

The simplest way to take advantage of this package is to add it to the Depends list of your package, include a java source tree somewhere in your package tree (most likely somewhere in the inst tree) with a build.xml file, and include configure and configure.win scripts at the root of the package that contain something like this:

#!/bin/sh

cd inst/java_src
"${R_HOME}/bin/Rscript" -e "ant::ant()"
cd ../..

This will be further illustrated with the demo package helloJavaWorld in future posts

Thursday, September 3 2009

update on the ant package

I have updated the ant package I described yesterday in this blog to add several things

  • Now the R code related to <r-set> and <r-run> tasks can either be given as the code attribute or as the text inside the task
  • The R code has access to special variables to manipulate the current project (project) and the current task (self) which can be used to set properties, get properties, ...
  • The package contains an ant function so that ant can be invoked using a simple Rscript call, see below

The package now includes a demonstrative build.xml file in the examples directory

Here is the result

Wednesday, September 2 2009

R capable version of ant

ant is an amazing build tool. I have been using ant for some time to build the java code that lives inside the src directories of my R packages, see this post for example.

The drawbacks of this approach are :

  • that it assumes ant is available on the system that builds the package
  • You cannot use R code within the ant build script

The ant package for R is developed to solve these two issues. The package is source-controlled in r-forge as part of the orchestra project

Once the package is installed, you will find an ant.R script in its exec directory. This script is pretty similar to the usual shell script that starts ant, but it sets things up so that ant can use R through the following additional tasks:

  • <r-run> : to run arbitrary R code
  • <r-set> : to set a property of the current project with the result of an R expression

Here is an example build file that demonstrates how to use these tasks

Here is what happens when we call the R special version of ant with this build script

$ `Rscript -e "cat( system.file( 'exec', 'ant.R', package = 'ant') )"`
Buildfile: build.xml

test:
     [echo] 
     [echo]   	R home        : /usr/local/lib/R
     [echo]   	R version     : R version 2.10.0 Under development (unstable) (2009-08-05 r49067)
     [echo]   	rJava home    : /usr/local/lib/R/library/rJava
     [echo]   	rJava version : 0.7-1
     [echo]  

BUILD SUCCESSFUL
Total time: 1 second

Tip: get java home from R with rJava

Assuming rJava is installed and works, it is possible to take advantage of its magic to get the path where java is installed:

$ Rscript --default-packages="methods,rJava" -e ".jinit(); .jcall( 'java/lang/System', 'S', 'getProperty', 'java.home' ) "
[1] "/opt/jdk/jre"

This is useful when you develop scripts that need to call a java program without assuming that java is on the path, or the JAVA_HOME environment variable is set, etc ...

Friday, August 28 2009

Combine R CMD build and junit

This is a post in the series Mixing R CMD build and ant. Previous posts have shown how to compile the java code that lives in the src directory of a package and how to document this code using javadoc.

This post tackles the problem of unit testing of the java functionality that is shipped as part of an R package. Java has several unit test platforms, we will use junit here, but similar things could be done with other systems such as testng, ...

The helloJavaWorld package now looks like this :

.
|-- DESCRIPTION
|-- NAMESPACE
|-- R
|   |-- helloJavaWorld.R
|   `-- onLoad.R
|-- inst
|   |-- doc
|   |   |-- helloJavaWorld.Rnw
|   |   |-- helloJavaWorld.pdf
|   |   `-- helloJavaWorld.tex
|   `-- java
|       |-- hellojavaworld-tests.jar
|       `-- hellojavaworld.jar
|-- man
|   `-- helloJavaWorld.Rd
`-- src
    |-- Makevars
    |-- build.xml
    |-- junit
    |   `-- HelloJavaWorld_Test.java
    |-- lib
    |   `-- junit-4.7.jar
    `-- src
        `-- HelloJavaWorld.java

9 directories, 15 files

We have added the src/lib directory, which contains the junit library, and the HelloJavaWorld_Test.java file, which contains a simple class with a unit test.

And the ant build file has been changed in order to

  • build the junit test cases, see the build-testcases target
  • run the unit tests, see the test target
  • create nice html reports, see the report target

The package can be downloaded here

Coming next, handling of dependencies between java code that lives in different R packages

Sunday, August 9 2009

Completion for java objects

As indicated in this thread, completion after the dollar operator can be customized by defining a custom names method. Here I am showing how to take advantage of this to display fields and methods of java references (jobjRef objects from the rJava package)

Here it is in action (I hit tab twice after the dollar sign)

Wednesday, August 5 2009

Code Snippet : List of CRAN packages

This is a really simple code snippet that shows how to get the list of CRAN packages and their titles from the html page (toulouse mirror in this example).

...

Note that R has the available.packages function, but it does not give the titles of the packages

Tuesday, August 4 2009

R parser package on CRAN

The parser package has been released to CRAN. The package mainly defines a function, parser, that is similar to the usual R function parse, with the following few differences:

  • The information about the location of each token is structured differently, in a data frame
  • location is gathered for all symbols from the source code, including terminal symbols (tokens), comments
  • An equal sign is identified to be either an assignment, the declaration of a formal argument or the use of an argument

Here is an example file containing R source code that we are going to parse with parser

#' a roxygen comment
f <- function( x = 3 ){
	
	# a regular comment
	rnorm(10 ) + runif( 10 )
	
}

It is a very simple file, for illustration purposes. Let's see what we can do with it using the parser package.

The parser generates a list of expressions, just like the regular parse function, but the gain is the data attribute. This is a data frame where each token of the parse tree is a line. The id column identifies each line, and the parent column identifies the parent of the current line.

At the moment, only the forthcoming highlight package uses the parser package (actually the parser package has been factored out of highlight), but some anticipated uses of the package include:

  • rework the codetools package so that it tells source location of potential problems
  • code coverage in RUnit or svUnit
  • rework the roxygen parser

Monday, August 3 2009

R GUI page on the R wiki

I've started the process of moving the content of this page to the R wiki. The motivation is that the content will become dynamic and updated much more often, people can add their own project, we can have use cases of each gui, tutorials, feature comparison, ...

When we are ready, we will add an entry for the jedit plugin

Wednesday, July 8 2009

useR! slides

I've pushed my slides from the presentation I've given at useR! a few minutes ago here

RGG# 154: demo of atomic functions

Przemyslaw Biecek has submitted this graph (and also others I will add later) to the graphics gallery

graph_154.png

A list of examples for the atomic functions polygon(), segments(), symbols(), arrows(), curve(), abline(), points(), lines(). This figure is taken from the book Przewodnik po pakiecie R.