Romain Francois, Professional R Enthusiast

To content | To menu | To search

Tag - cplusplus

Entries feed - Comments feed

Monday, November 5 2012

OOP with Rcpp modules



The purpose of Rcpp modules has always been to make it easy to expose C++ functions and classes to R. Up to now, Rcpp modules did not have a way to declare inheritance between C++ classes. This is now fixed in the development version, and the next version of Rcpp will have a simple mechanism to declare inheritance.

Consider this simple example, we have a base class Shape with two virtual methods (area and contains) and two classes Circle and Rectangle) each deriving from Shape and representing a specific shape.

The classes might look like this:

And we can expose these classes to R using the following module declarative code:

It is worth noticing that:

  • The area and contains methods are exposed as part of the base Shape class
  • Classes Rectangle and Circle simply declare that they derive from Shape using the derives notation.

R code that uses these classes looks like this:

shapes.jpg

Thursday, October 25 2012

Rcpp modules more flexible


Rcpp modules just got more flexible (as of revision 3838 of Rcpp, to become 0.9.16 in the future).

modules have allowed exposing C++ classes for some time now, but developpers had to declare custom wrap and as specializations if they wanted their classes to be used as return type or argument type of a C++ function or method. This led to writing boilerplate code. The newest devel version allows for syntax like this:

The only thing the developper has to do is to declare the class using the macro RCPP_EXPOSED_CLASS. This will declare the appropriate class traits that Rcpp is using for internal implementations of as and wrap

One the example we can see three examples of the new functionality:

  • make_foo : this returns a Foo
  • cloner: this returns a Foo*
  • bla: uses a const Foo& as argument


Saturday, November 26 2011

int64: 64 bit integer vectors for R

google-64.png

The Google Open Source Programs Office sponsored me to create the new int64 package that has been released to CRAN a few days ago. The package has been mentionned in an article in the open source blog from Google.

The package defines classes int64 and uint64 that represent signed and unsigned 64 bit integer vectors. The package also allows conversion of several types (integer, numeric, character, logical) to 64 bit integer vectors, arithmetic operations as well as other standard group generic functions, and reading 64 bit integer vectors as a data.frame column using int64 or uint64 as the colClasses argument.

The package has a vignette that details its features, several examples are given in the usual help files. Once again, I've used RUnit for quality insurance about the package code

int64 has been developped so that 64 bit integer vectors are represented using only R data structures, i.e data is not represented as external pointers to some C++ object. Instead, each 64 bit integer is represented as a couple of regular 32 bit integers, each of them carrying half the bits of the underlying 64 bit integer. This was a choice by design so that 64 bit integer vectors can be serialized and used as data frame columns.

The package contains C++ headers that third party packages can used (via LinkingTo: int64) to use the C++ internals. This allows creation and manipulation of the objects in C++. The internals will be documented in another vignette for package developpers who wish to use the internals. For the moment, the main entry point is the C++ template class LongVector.

I'm particularly proud that Google trusted me to sponsor the development of int64. The next versions of packages Rcpp and RProtoBuf take advantage of the facilities of int64, e.g. Rcpp gains wrapping of C++ containers of 64 bit integers as R objects of classes int64 and uint64 and RProtoBuf improves handling of 64 bit integers in protobuf messages. More on this later

Wednesday, March 30 2011

Rcpp workshop in Chicago on April 28th

Overview

This year's R/Finance conference will be preceded by a full-day masterclass on Rcpp and related topics which will be held on Thursday, April 28, 2011, the Univ. of Illinois at Chicago campus.

Join Dirk Eddelbuettel and Romain Francois for six hours of detailed and hands-on instructions and discussions around Rcpp, inline, RInside, RcppArmadillo and other packages---in intimate small-group setting.

The full-day format allows to combine a morning introductory session with a more advanced afternoon session while leaving room for sufficient breaks. There will be about six hours of instructions, a one-hour lunch break and two half-hour coffee breaks.

Morning session: "A hands-on introduction to R and C++"

The morning session will provide a practical introduction to the Rcpp package (and other related packages). The focus will be on simple and straightforward applications of Rcpp in order to extend R and/or to significantly accelerate the execution of simple functions.

The tutorial will cover the inline package which permits embedding of self-contained C, C++ or Fortran code in R scripts. We will also discuss RInside to embed R code in C++ applications, as well as standard Rcpp extension packages such as RcppArmadillo for linear algebra and RcppGSL.

Afternoon session: "Advanced R and C++ topics"

This afternoon tutorial will provide a hands-on introduction to more advanced Rcpp features. It will cover topics such as writing packages that use Rcpp, how 'Rcpp modules' and the new R ReferenceClasses interact, and how 'Rcpp sugar' lets us write C++ code that is often as expressive as R code. Another possible topic, time permitting, may be writing glue code to extend Rcpp to other C++ projects.

We also hope to leave some time to discuss problems brought by the class participants.

Prerequisites

Knowledge of R as well as general programming knowledge; C or C++ knowledge is helpful but not required.

Users should bring a laptop set up so that R packages can be built. That means on Windows, Rtools needs to be present and working, and on OS X the Xcode package should be installed.

Registration

Registration is available via the R/Finance conference at

http://www.RinFinance.com/register/

or directly at RegOnline

http://www.regonline.com/930153

The cost is USD 500 for the whole day, and space will be limited.

Questions

Please contact us directly at RomainAndDirk@r-enthusiasts.com

Wednesday, December 1 2010

RcppGSL 0.1.0

Gnu

We released the first version of our RcppGSL package. RcppGSL extends Rcpp to help programmers code with the GNU Scientific Library (GSL).

The package contains template classes in the RcppGSL namespace that act as smart pointers to the associated GSL data structure. For example, a RcppGSL::vector<:double> object acts a smart pointer to a gsl_vector*. Having the pointer shadowed by a smart pointer allows us to take advantage of C++ features such as operator overloading, etc ... which for example allows us to extract an element from the GSL vector simply using [] instead of GSL functions gsl_vector_get and gsl_vector_set

The package contains a 11 pages vignette that explains the features in details, with examples. The vignette also discusses how to actually use RcppGSL, either in another package (preferred) or directly from the R prompt through the inline package.

Saturday, July 10 2010

Rcpp 0.8.4

Dirk uploaded Rcpp 0.8.4 to CRAN yesterday. This release quickly follows the release of Rcpp 0.8.3, because there was some building problems (particularly on the ppc arch on OSX).

Rcpp sugar

Sugar:Cubed

Already available in Rcpp 0.8.3, the new sugar feature was extended in 0.8.4 to cover more functions, and we have now started to adapt sugar for matrices with functions such as outer, row, diag, etc ...

Here is an example of using the sugar version of outer

NumericVector xx(x) ;
NumericVector yy(y);
NumericMatrix m = outer( xx, yy, std::plus<double>() ) ;
return m ;

This mimics the R code

> outer( x, y, "+" )

Here is the relevant extract of the NEWS file:

0.8.4   2010-07-09

o   new sugar vector functions: rep, rep_len, rep_each, rev, head, tail, diag
	
o	sugar has been extended to matrices: The Matrix class now extends the Matrix_Base template that implements CRTP. Currently sugar functions for matrices are: outer, col, row, lower_tri, upper_tri, diag

o   The unit tests have been reorganised into fewer files with one call each to cxxfunction() (covering multiple tests) resulting in a significant speedup

o	The Date class now uses the same mktime() replacement that R uses (based on original code from the timezone library by Arthur Olson) permitting wide dates ranges on all operating systems

o   The FastLM/example has been updated, a new benchmark based on the historical Longley data set has been added

o   RcppStringVector now uses std::vector<std::string> internally

o    setting the .Data slot of S4 objects did not work properly

Wednesday, June 30 2010

Rmetrics slides

I presented Rcpp at the Rmetrics conference earlier today, this was a really good opportunity to look back at all the work Dirk and I have been commiting into Rcpp.

I've uploaded my slides here (pdf) and on slideshare :

and some pictures on flickr:

Tuesday, June 8 2010

Rcpp 0.8.1

We released Rcpp 0.8.0 almost a month ago. It finalized our efforts in designing a better, faster and more natural API than any version of Rcpp ever before. The journey from Rcpp 0.7.0 to Rcpp 0.8.0 has mainly been a coding and testing effort for designing the API.

And now for something completely different

We have now started (with release 0.8.1 of Rcpp) a new development cycle towards the 0.9.0 version with two major goals in mind

  • We want to improve documentation. To that end Rcpp 0.8.1 includes 4 new vignettes. more on that later.
  • We want to cross the boundaries between R and C++. Rcpp 0.8.1 introduces Rcpp modules. Modules allows the programmer to expose C++ classes and functions at the R level, with great ease.

new vignettes

Rcpp-FAQ :Frequently Asked Questions about Rcpp collects some of the frequently asked questions from the mailing list and from private exchanges with many people.

Rcpp-extending: Extending Rcpp shows how to extend Rcpp converters Rcpp::wrap and Rcpp::as to user defined types (C++ classes defined in someone else's package and third party types (C++ classes defined in some third party library used by a package. The document is based on our experience developping the RcppArmadillo package

Rcpp-package : Writing a package that uses Rcpp highlights the steps involved in making a package that uses Rcpp. The document is based on the Rcpp.package.skeleton function

finally, Rcpp-modules : Exposing C++ functions and classes with Rcpp modules documents the current feature set of Rcpp modules

Rcpp modules

Rcpp modules are inspired from the Boost.Python C++ library. Rcpp modules let you expose C++ classes and functions to the R level with minimal involvment from the programmer

The feature is best described by an example (more examples on the vignette). Say we want to expose this simple class:

This would typically involve external pointers. With Rcpp modules, we can simply declare what we want to expose about this class, and Rcpp takes care of the how to expose it:

The R side consists of grabbing a reference to the module, and just use the World class

The Rcpp-modules vignette gives more details about modules, including how to use them in packages

More details about 0.8.1 release

Here is the complete extract from our NEWS file about this release

Wednesday, June 2 2010

inline 0.3.5

The inline package is an amazing, yet simple, package for R. It allows to dynamically (within the R session) define R functions and S4 methods with inlined C/C++/Fortran code.

Together with RUnit, inline powers the entire unit test suite of Rcpp.

As agreed with Oleg Sklyar, who maintains inline, we made a few additions to inline to accomodate the needs of the Rcpp family of packages.

cxxfunction

The main addition is cxxfunction which is very similar to cfunction, except that it only focuses on C++ code using the .Call calling convention. cxxfunction uses a plugin system allowing other packages to control the code that is generated before compilation, environment variables, etc ... For example, the next version of Rcpp defines an inline plugin that takes care of all the details (find the Rcpp include path, link against the Rcpp user library, etc ...)

Here is an example, from the cxxfunction help page using the Rcpp plugin (this will only work with the next version of Rcpp, because the current version does not know about this)

Here is an example using the plugin from the next version of RcppArmadillo

package.skeleton

Another addition to inline concerns the package.skeleton. We've made it S4 generic in inline and defined methods for the CFunc and CFuncList classes. In short, this allows to prototype some code using inline and quickly dump the code into a proper package

For example, here we make two functions using cxxfunction and then generate a package skeleton directly from them

Furthermore, the package.skeleton methods are aware of the plugin system, which allows plugin to have some control of additional steps involved in making the package skeleton, such as Makevars files, etc ...

getDynLib

getDynLib has been introduced in this version of inline to grab a reference to the dynamic library associated with a package, a function (CFunc object) generated by inline, or a set of functions (CFuncList object) generated by inline

Wednesday, May 19 2010

RcppArmadillo 0.2.1

Armadillo

Armadillo is a C++ linear algebra library aiming towards a good balance between speed and ease of use. Integer, floating point and complex numbers are supported, as well as a subset of trigonometric and statistics functions. Various matrix decompositions are provided through optional integration with LAPACK and ATLAS libraries.

A delayed evaluation approach is employed (during compile time) to combine several operations into one and reduce (or eliminate) the need for temporaries. This is accomplished through recursive templates and template meta-programming.

This library is useful if C++ has been decided as the language of choice (due to speed and/or integration capabilities), rather than another language like Matlab or Octave. It is distributed under a license that is useful in both open-source and commercial contexts.

Armadillo is primarily developed by Conrad Sanderson at NICTA (Australia), with contributions from around the world.

RcppArmadillo

RcppArmadillo is an R package that facilitates using Armadillo classes in R packages through Rcpp. It achieves the integration by extending Rcpp's data interchange concepts to Armadillo classes.

Example

Here is a simple implementation of a fast linear regression (provided by RcppArmadillo via the fastLm() function):

Note however that you may not want to compute a linear regression fit this way in order to protect from numerical inaccuracies on rank-deficient problems. The help page for fastLm() provides an example.

Using RcppArmadillo in other packages

RcppArmadillo is designed so that its classes can be used from other packages.

Using RcppArmadillo requires:

  • Using the header files provided by Rcpp and RcppArmadillo. This is typically achieved by adding this line in the DESCRIPTION file of the client package:

    LinkingTo : Rcpp, RcppArmadillo

    and the following line in the package code:

    #include <RcppArmadillo.h>
  • Linking against Rcpp dynamic or shared library and librairies needed by Armadillo, which is achieved by adding this line in the src/Makevars file of the client package

    PKG_LIBS = $(shell $(R_HOME)/bin/Rscript -e "Rcpp:::LdFlags()" ) $(LAPACK_LIBS) $(BLAS_LIBS) $(FLIBS)
    

    and this line in the file src/Makevars.win:

    PKG_LIBS = $(shell Rscript.exe -e "Rcpp:::LdFlags()") $(LAPACK_LIBS) $(BLAS_LIBS) $(FLIBS)
    

RcppArmadillo contains a function RcppArmadillo.package.skeleton, modelled after package.skeleton from the utils package in base R, that creates a skeleton of a package using RcppArmadillo, including example code.

Quality Assurance

RcppArmadillo uses the RUnit package by Matthias Burger et al to provide unit testing. RcppArmadillo currently has 19 unit tests (called from 8 unit test functions).

Source code for unit test functions are stored in the unitTests directory of the installed package and the results are collected in the RcppArmadillo-unitTests vignette.

We run unit tests before sending the package to CRAN on as many systems as possible, including Mac OSX (Snow Leopard), Debian, Ubuntu, Fedora 12 (64bit), Win 32 and Win64.

Unit tests can also be run from the installed package by executing

RcppArmadillo:::test()

where an output directory can be provided as an optional first argument.

Links

Support

Questions about RcppArmadillo should be directed to the Rcpp-devel mailing list at https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel

Questions about Armadillo itself should be directed to its forum http://sourceforge.net/apps/phpbb/arma/

 -- Romain Francois, Montpellier, France
    Dirk Eddelbuettel, Chicago, IL, USA
    Doug Bates, Madison, WI, USA

    May 2010

Wednesday, May 12 2010

Rcpp 0.8.0

Summary

Version 0.8.0 of the Rcpp package was released to CRAN today. This release marks another milestone in the ongoing redesign of the package, and underlying C++ library.

Overview

Rcpp is an R package and C++ library that facilitates integration of C++ code in R packages.

The package features a set of C++ classes (Rcpp::IntegerVector, Rcpp::Function, Rcpp::Environment, ...) that makes it easier to manipulate R objects of matching types (integer vectors, functions, environments, etc ...).

Rcpp takes advantage of C++ language features such as the explicit constructor/destructor lifecycle of objects to manage garbage collection automatically and transparently. We believe this is a major improvement over PROTECT/UNPROTECT. When an Rcpp object is created, it protects the underlying SEXP so that the garbage collector does not attempt to reclaim the memory. This protection is withdrawn when the object goes out of scope. Moreover, users generally do not need to manage memory directly (via calls to new / delete or malloc / free) as this is done by the Rcpp classes or the corresponding STL containers.

API

Rcpp provides two APIs: an older set of classes we refer to the classic API (see below for the section 'Backwards Compatibility) as well as second and newer set of classes.

Classes of the new Rcpp API belong to the Rcpp namespace. Each class is associated to a given SEXP type and exposes an interface that allows manipulation of the object that may feel more natural than the usual use of macros and functions provided by the R API.

SEXP type Rcpp class
INTSXP Rcpp::IntegerVector
REALSXP Rcpp::NumericVector
RAWSXP Rcpp::RawVector
LGLSXP Rcpp::LogicalVector
CPLXSXP Rcpp::ComplexVector
STRSXP Rcpp::CharacterVector
VECSXP Rcpp::List
EXPRSXP Rcpp::ExpressionVector
ENVSXP Rcpp::Environment
SYMSXP Rcpp::Symbol
CLOSXP
BUILTINSXP Rcpp::Function
SPECIALSXP
LANGSXP Rcpp::Language
LISTSXP Rcpp::Pairlist
S4SXP Rcpp::S4
PROMSXP Rcpp::Promise
WEAKREFSXP Rcpp::WeakReference
EXTPTRSXP template < typename T> Rcpp::XPtr    

Some SEXP types do not have dedicated Rcpp classes : NILSXP, DOTSXP, ANYSXP, BCODESXP and CHARSXP.

Still missing are a few convenience classes such as Rcpp::Date or Rcpp::Datetime which would map useful and frequently used R data types, but which do not have an underlying SEXP type.

Data Interchange

Data interchange between R and C++ is managed by extensible and powerful yet simple mechanisms.

Conversion of a C++ object is managed by the template function Rcpp::wrap. This function currently manages :

  • primitive types : int, double, bool, float, Rbyte, ...
  • std::string, const char*
  • STL containers such as std::vector<T> and STL maps such as std::mapr< std::string, Tr> provided that the template type T is wrappable
  • any class that can be implicitely converted to SEXP, through operator SEXP()

Conversion of an R object to a C++ object is managed by the Rcpp::as<T> template which can handle:

  • primitive types
  • std::string, const char*
  • STL containers such as std::vector<T>

Rcpp::wrap and Rcpp::as are often used implicitely. For example, when assigning objects to an environment:

  // grab the global environment
  Rcpp::Environment global = Rcpp::Environment::global_env() ;
  std::deque z( 3 ); z[0] = false; z[1] = true; z[3] = false ;

  global["x"] = 2 ;                    // implicit call of wrap
  global["y"] = "foo";                 // implicit call of wrap
  global["z"] = z ;                    // impl. call of wrap>

  int x = global["x"] ;                // implicit call of as
  std::string y = global["y"]          // implicit call of as
  std::vector z1 = global["z"] ; // impl. call of as>

Rcpp contains several examples that illustrate wrap and as. The mechanism was designed to be extensible. We have developped separate packages to illustrate how to extend Rcpp conversion mechanisms to third party types.

  • RcppArmadillo : conversion of types from the Armadillo C++ library.
  • RcppGSL : conversion of types from the GNU Scientific Library.

Rcpp is also used for data interchange by the RInside package which provides and easy way of embedding an R instance inside of C++ programs.

inline use

Rcpp depends on the inline package by Oleg Sklyar et al. Rcpp then uses the 'cfunction' provided by inline (with argument Rcpp=TRUE) to compile, link and load C++ function from the R session.

As of version 0.8.0 of Rcpp, we also define an R function cppfunction that acts as a facade function to the inline::cfuntion, with specialization for C++ use.

This allows quick prototyping of compiled code. All our unit tests are based on cppfunction and can serve as examples of how to use the mechanism. For example this function (from the runit.GenericVector.R unit test file) defines from R a C++ (simplified) version of lapply:

  ## create a compiled function cpp_lapply using cppfunction 
  cpp_lapply <- cppfunction(signature(x = "list", g = "function" ), 
  		'Function fun(g) ;
		 List input(x) ;
		 List output( input.size() ) ;
		 std::transform( input.begin(), input.end(), output.begin(), fun ) ;
		 output.names() = input.names() ;
		 return output ;
	    ')
  ## call cpp_lapply on the iris data with the R function summary
  cpp_lapply( iris, summary )	

Using Rcpp in other packages

Rcpp is designed so that its classes are used from other packages. Using Rcpp requires :

  • using the header files provided by Rcpp. This is typically done by adding this line in the package DESRIPTION file:
    	LinkingTo: Rcpp
    
    and add the following line in the package code:
    	#include <Rcpp.h>
    
  • linking against the Rcpp dynamic or static library, which is achieved by adding this line to the src/Makevars of the package:
    	PKG_LIBS = $(shell $(R_HOME)/bin/Rscript -e "Rcpp:::LdFlags()" )
    
    and this line to the src/Makevars.win file:
    	PKG_LIBS = $(shell Rscript.exe -e "Rcpp:::LdFlags()")
    

Rcpp contains a function Rcpp.package.skeleton, modelled after package.skeleton from the utils package in base r, that creates a skeleton of a package using Rcpp, including example code.

C++ exceptions

C++ exceptions are R contexts are both based on non local jumps (at least on the implementation of exceptions in gcc), so care must be ensure that one system does not void assumptions of the other. It is therefore very strongly recommended that each function using C++ catches C++ exceptions. Rcpp offers the function forward_exception_to_r to facilitate forwarding the exception to the "R side" as an R condition. For example :

  SEXP foo( ) {
    try {
      // user code here
    } catch( std::exception& __ex__){
      forward_exception_to_r( __ex__ ) ;
    }
    // return something here
  }

Alternatively, functions can enclose the user code with the macros BEGIN_RCPP and END_RCPP, which provides for a more compact way of programming. The function above could be written as follows using the macros:

  SEXP foo( ) {
    BEGIN_RCPP
    // user code here
    END_RCPP
    // return something here
  }

The use of BEGIN_RCPP and END_RCPP is recommended to anticipate future changes of Rcpp. We might for example decide to install dedicated handlers for specific exceptions later.

Experimental code generation macros

Rcpp contains several macros that can generate repetitive 'boiler plate' code:

  RCPP_FUNCTION_0, ..., RCPP_FUNCTION_65
  RCPP_FUNCTION_VOID_0, ..., RCPP_FUNCTION_VOID_65
  RCPP_XP_METHOD_0, ..., RCPP_XP_METHOD_65
  RCPP_XP_METHOD_CAST_0, ..., RCPP_XP_METHOD_CAST_65
  RCPP_XP_METHOD_VOID_0, ..., RCPP_XP_METHOD_VOID_65

For example:

  RCPP_FUNCTION_2( int, foobar, int x, int y){
     return x + y ;
  }

This will create a .Call compatible function "foobar" that calls a c++ function for which we provide the argument list (int x, int y) and the return type (int). The macro also encloses the call in BEGIN_RCPP/END_RCPP so that exceptions are properly forwarded to R.

Examples of the other macros are given in the NEWS file.

This feature is still experimental, but is being used in packages highlight and RProtoBuf

Quality Assurance

Rcpp uses the RUnit package by Matthias Burger et al and the aforementioned inline package by Oleg Sklyar et al to provide unit testing. Rcpp currently has over 500 unit tests (called from more than 230 unit test functions) with very good coverage of the critical parts of the package and library.

Source code for unit test functions are stored in the unitTests directory of the installed package and the results are collected in the "Rcpp-unitTests" vignette.

The unit tests can be both during the standard R package build and testing process, and also when the package is installed. The latter use is helpful to ensure that no system components have changed in a way that affect the Rcpp package since it has been installed. To run the tests, execute

   Rcpp:::test()

where an output directory can be provided as an optional first argument.

Backwards Compatibility

We believe the new API is now more complete and useful than the previous set of classes, which we refer to as the "classic Rcpp API". We would therefore recommend to package authors using 'classic' Rcpp to move to the new API. However, the classic API is still maintained and will continue to be maintained to ensure backwards compatibility for code that uses it.

Packages uses the 'Classic API' can use features of the new API selectively and in incremental steps. This provides for a non-disruptive upgrade path.

Documentation

The package contains a vignette which provides a short and succinct introduction to the Rcpp package along with several motivating examples. Also provided is a vignette containing the regression test summary from the time the package was built.

Links

Support

Questions about Rcpp should be directed to the Rcpp-devel mailing list

 -- Dirk Eddelbuettel and Romain Francois
    Chicago, IL, USA, and Montpellier, France
	May 2010

Sunday, February 14 2010

Rcpp 0.7.7

A good 2 days after 0.7.6 was released, here comes Rcpp 0.7.7. The reason for this release is that a subtle bug installed itself and we did not catch it in time

The new version also includes two new class templates : unary_call and binary_call that help integration of calls (e.g. Rcpp::Language objects) with STL algorithms. For example here is how we might use unary_call

This emulates the code

> lapply( 1:10, function(n) seq(from=n, to = 0 ) )

As usual, more examples in the unit tests

Saturday, February 13 2010

highlight 0.1-5

I've pushed the version 0.1-5 of highlight to CRAN, it should be available in a couple of days.

This version fixes highlighting of code when one wants to display the prompt and the continue prompt. For example, this code :

rnorm(10, 
	mean = 5)


runif(5)

gets highlighted like this:

using this code:

> highlight( "/tmp/test.R", renderer=renderer_html(document=T), showPrompts = TRUE, output = "test.html" )

Under the hood, highlight now depends on Rcpp and uses some of the C++ classes of the new Rcpp API. See the get_highlighted_text function in the code.

Rcpp 0.7.6

Rcpp 0.7.6 was released yesterday. This is mostly a maintenance update since the version 0.7.5 had some very minor issues on windows, but we still managed however to include some new things as well.

Vectors can now use name based indexing. This is typically useful for things like data frame, which really are named lists. Here is an example from our unit tests where we grab a column from a data frame and then compute the sum of its values:

The classes CharacterVector, GenericVector(aka List) and ExpressionVector now have iterators. Below is another example from our unit tests, where we use iterators to implement a C++ version of lapply using the std::transform algorithm from the STL.

Generic vectors (lists) gain some methods that make them look more like std::vector from the STL : push_back, push_front, insert and erase. Examples of using these methods are available in our unit tests:

> system.file( "unitTests", "runit.GenericVector.R", 
+ package = "Rcpp" )

Tuesday, February 9 2010

Rcpp 0.7.5

Dirk released Rcpp 0.7.5 yesterday

The main thing is the smarter wrap function that now uses techniques of type traits and template meta-programming to have a compile time guess at whether an object is wrappable, and how to do it. Currently wrappable types are :

  • primitive types : int, double, Rbyte, Rcomplex
  • std::string
  • STL containers such as std::vector<T> as long as T is wrappable. This is not strictly tied to the STL, actually any type that has a nested type called iterator and member functions begin() and end() will do
  • STL maps keyed by strings such as std::map<std::string,T> as long as T is wrappable
  • any class that can be implicitely converted to SEXP
  • any class for which the wrap template is partly or fully specialized. (The next version of RInside has an example of that)

Here comes an example (from our unit tests) :

        funx <- cfunction(signature(), 
        '
        std::map< std::string,std::vector<int> > m ;
        std::vector<int> b ; b.push_back(1) ; b.push_back(2) ; m["b"] = b ;
        std::vector<int> a ; a.push_back(1) ; a.push_back(2) ; a.push_back(2) ; m["a"] = a ;
        std::vector<int> c ; c.push_back(1) ; c.push_back(2) ; c.push_back(2) ; c.push_back(2) ; m["c"] = c ;
        return wrap(m) ;
        ', 
        Rcpp=TRUE, verbose=FALSE, includes = "using namespace Rcpp;" )
R> funx()
$a
[1] 1 2 2

$b
[1] 1 2

$c
[1] 1 2 2 2

Apart from that, other things have changed, here is the relevant section of the NEWS for this release

    o 	wrap has been much improved. wrappable types now are :
    	- primitive types : int, double, Rbyte, Rcomplex, float, bool
    	- std::string
    	- STL containers which have iterators over wrappable types:
    	  (e.g. std::vector, std::deque, std::list, etc ...). 
    	- STL maps keyed by std::string, e.g std::map
    	- classes that have implicit conversion to SEXP
    	- classes for which the wrap template if fully or partly specialized
    	This allows composition, so for example this class is wrappable: 
    	std::vector< std::map > (if T is wrappable)
    	
    o 	The range based version of wrap is now exposed at the Rcpp::
    	level with the following interface : 
    	Rcpp::wrap( InputIterator first, InputIterator last )
    	This is dispatched internally to the most appropriate implementation
    	using traits

    o	a new namespace Rcpp::traits has been added to host the various
    	type traits used by wrap

    o 	The doxygen documentation now shows the examples

    o 	A new file inst/THANKS acknowledges the kind help we got from others

    o	The RcppSexp has been removed from the library.
    
    o 	The methods RObject::asFoo are deprecated and will be removed
    	in the next version. The alternative is to use as.

    o	The method RObject::slot can now be used to get or set the 
    	associated slot. This is one more example of the proxy pattern
    	
    o	Rcpp::VectorBase gains a names() method that allows getting/setting
    	the names of a vector. This is yet another example of the 
    	proxy pattern.
    	
    o	Rcpp::DottedPair gains templated operator<< and operator>> that 
    	allow wrap and push_back or wrap and push_front of an object
    	
    o	Rcpp::DottedPair, Rcpp::Language, Rcpp::Pairlist are less
    	dependent on C++0x features. They gain constructors with up
    	to 5 templated arguments. 5 was choosed arbitrarily and might 
    	be updated upon request.
    	
    o	function calls by the Rcpp::Function class is less dependent
    	on C++0x. It is now possible to call a function with up to 
    	5 templated arguments (candidate for implicit wrap)
    	
    o	added support for 64-bit Windows (thanks to Brian Ripley and Uwe Ligges)

Wednesday, January 13 2010

Rcpp 0.7.2

Rcpp 0.7.2 is out, checkout Dirk's blog for details

selected highlights from this new version:

character vectors

if one wants to mimic this R code in C

> x <- c( "foo", "bar" )
one ends up with this :
SEXP x = PROTECT( allocVector( STRSXP, 2) ) ;
SET_STRING_ELT( x, 0, mkChar( "foo" ) ) ;
SET_STRING_ELT( x, 1, mkChar( "bar" ) ) ;
UNPROTECT(1) ;
return x ;

Rcpp lets you express the same like this :

CharacterVector x(2) ;
x[0] = "foo" ; 
x[1] = "bar" ;

or like this if you have GCC 4.4 :

CharacterVector x = { "foo", "bar" } ;

environments, functions, ...

Now, we try to mimic this R code in C :
rnorm( 10L, sd = 100 )
You can do one of these two ways in Rcpp :
Environment stats("package:stats") ;
Function rnorm = stats.get( "rnorm" ) ;
return rnorm( 10, Named("sd", 100 ) ) ;

or :

Language call( "rnorm", 10, Named("sd", 100 ) ) ;
return eval( call, R_GlobalEnv ) ;

and it will get better with the next release, where you will be able to just call call.eval() and stats["rnorm"].

Using the regular R API, you'd write these liks this :

SEXP stats = PROTECT( R_FindNamespace( mkString("stats") ) ) ;
SEXP rnorm = PROTECT( findVarInFrame( stats, install("rnorm") ) ) ;
SEXP call  = PROTECT( LCONS( rnorm, CONS(ScalarInteger(10), CONS(ScalarReal(100.0), R_NilValue)))) ;
SET_TAG( CDDR(call), install("sd") ) ;
SEXP res = PROTECT( eval( call, R_GlobalEnv ) );
UNPROTECT(4) ;
return res ;

or :

SEXP call  = PROTECT( LCONS( install("rnorm"), CONS(ScalarInteger(10), CONS(ScalarReal(100.0), R_NilValue)))) ;
SET_TAG( CDDR(call), install("sd") ) ;
SEXP res = PROTECT( eval( call, R_GlobalEnv ) );
UNPROTECT(2) ;
return res ;

Friday, January 8 2010

External pointers with Rcpp

One of the new features of Rcpp is the XPtr class template, which lets you treat an R external pointer as a regular pointer. For more information on external pointers, see Writing R extensions.

To use them, first we need a pointer to some C++ data structure, we'll use a pointer to a vector<int> :

/* creating a pointer to a vector<int> */
std::vector<int>* v = new std::vector<int> ;
v->push_back( 1 ) ;
v->push_back( 2 ) ;

Then, using the XPtr template class we wrap the pointer in an R external pointer

/* wrap the pointer as an external pointer */
/* this automatically protected the external pointer from R garbage 
   collection until p goes out of scope. */
Rcpp::XPtr< std::vector<int> > p(v, true) ;

The first parameter of the constructor is the actual (sometimes called dumb) pointer, and the second parameter is a flag indicating that we need to register a delete finalizer with the external pointer. When the external pointer goes out of scope, it becomes subject to garbage collection, and when it is garbage collected, the finalizer is called, which then calls delete on the dumb pointer.

Wrapping it all together thanks to the inline package, here's a function that creates an external pointer to a vector<int> and return it to R

        funx <- cfunction(signature(), '
                /* creating a pointer to a vector<int> */
                std::vector<int>* v = new std::vector<int> ;
                v->push_back( 1 ) ;
                v->push_back( 2 ) ;
                
                /* wrap the pointer as an external pointer */
                /* this automatically protected the external pointer from R garbage 
                   collection until p goes out of scope. */
                Rcpp::XPtr< std::vector<int> > p(v, true) ;
                
                /* return it back to R, since p goes out of scope after the return 
                   the external pointer is no more protected by p, but it gets 
                   protected by being on the R side */
                return( p ) ;
        ', Rcpp=TRUE, verbose=FALSE)
        xp <- funx()

At that point, xp is an external pointer object

> xp
<pointer: 0x9c850c8>
> typeof( xp )
[1] "externalptr"

Then, we can pass it back to the C(++) layer, an continue to work with the wrapped stl vector of ints. For this we use the other constructor for the XPtr class template, that takes an R object (SEXP) of sexp type EXTPTRSXP.


/* wrap the SEXP as a smart external pointer */
Rcpp::XPtr< std::vector<int> > p(x) ;

/* use p as a 'dumb' pointer */
p->front() ;

Again, we can wrap this up for quick prototyping using the inline package :

        # passing the pointer back to C++
        funx <- cfunction(signature(x = "externalptr" ), '
                /* wrapping x as smart external pointer */
                /* The SEXP based constructor does not protect the SEXP from 
                   garbage collection automatically, it is already protected 
                   because it comes from the R side, however if you want to keep 
                   the Rcpp::XPtr object on the C(++) side
                   and return something else to R, you need to protect the external
                   pointer, by using the protect member function */
                Rcpp::XPtr< std::vector<int> > p(x) ;
                
                /* just return the front of the vector as a SEXP */
                return( Rcpp::wrap( p->front() ) ) ;
        ', Rcpp=TRUE, verbose=FALSE)
        front <- funx(xp)
> front
[1] 1

The example is extracted from one unit tests that we use in Rcpp, see the full example :

> system.file( "unitTests", "runit.XPTr.R", package = "Rcpp" )
[1] "/usr/local/lib/R/library/Rcpp/unitTests/runit.XPTr.R"

See also the announcement for the release of Rcpp 0.7.1 here to get a list of new features, or wait a few days to see version 0.7.2.

Using the XPtr class template is the bread and butter of the CPP package I blogged about here

Tuesday, December 29 2009

C++ exceptions at the R level

The feature described in this post is no longer valid with recent versions of Rcpp. Setting a terminate handler does not work reliably on windows, so we don't do it at all anymore. Exceptions need to be caught and relayed to R. Bracketing the code with BEGIN_RCPP / END_RCPP does it simply. See the Rcpp-introduction vignette for details.

I've recently offered an extra set of hands to Dirk to work on the Rcpp package, this serves a good excuse to learn more about C++

Exception management was quite high on my list. C++ has nice exception handling (well not as nice as java, but nicer than C).

With previous versions of Rcpp, the idiom was to wrap up everything in a try/catch block and within the catch block, call the Rf_error function to send an R error, equivalent of calling stop. Now things have changed and, believe it or not, you can now catch a C++ exception at the R level, using the standard tryCatch mechanism

, so for example when you throw a C++ exception (inheriting from the class std::exception) at the C++ level, and the exception is not picked up by the C++ code, it automatically sends an R condition that contain the message of the exception (what the what member function of std::exception gives) as well as the class of the exception (including namespace)

This, combined with the new inline support for Rcpp, allows to run this code, (also available in the inst/examples/RcppInline directory of Rcpp)

require(Rcpp)
require(inline)
funx <- cfunction(signature(), '
throw std::range_error("boom") ;
return R_NilValue ;
', Rcpp=TRUE, verbose=FALSE)

Here, we create the funx "function" that compiles itself into a C++ function and gets dynamically linked into R (thanks to the inline package). The relevant thing (at least for this post) is the throw statement. We throw a C++ exception of class "std::range_error" with the message "boom", and what follows shows how to catch it at the R level:

tryCatch(  funx(), "C++Error" = function(e){
    cat( sprintf( "C++ exception of class '%s' : %s\n", class(e)[1L], e$message  ) )
} )
# or using a direct handler 
tryCatch(  funx(), "std::range_error" = function(e){
        cat( sprintf( "C++ exception of class '%s' : %s\n", class(e)[1L], e$message  ) )
} )

... et voila

Under the carpet, the abi unmangling namespace is at work, and the function that grabs the uncaught exceptions is much inspired from the verbose terminate handler that comes with the GCC

Part of this was inspired from the new java exception handling that came with the version 0.8-0 of rJava, but cooked with C++ ingredients

Tuesday, December 22 2009

CPP package: exposing C++ objects

I've just started working on the new package CPP, as usual the project is maintained in r-forge. The package aims at exposing C++ classes at the R level, starting from classes from the c++ standard template library.

key to the package is the CPP function (much inspired from the J function of rJava). The CPP function builds an S4 object of class "C++Class". The "C++Class" currently is a placeholder wrapping the C++ class name, and defines the new method (again this trick or making new S4 generic comes from rJava). For example to create an R object that wraps up a std::vector<int>, one would go like this:

x <- new( CPP( "vector<int>" ) )

This is no magic and don't expect to be able to send anything to CPP (C++ does not have reflection capabilities), currently only these classes are defined : std::vector<int>, vector<double>, vector<raw> and set<int>

Because C++ does not offer reflection capabilities, we have to do something else to be able to invoke methods on the wrapped objects. Currently the approach that the package follows is a naming convention. The $ method create the name of the C routine it wants to call based on the C++ class the object is wrapping, the name of the method, and the types of the input parameters. So for example calling the size method for a vector<:int> object yields this routine name: "vector_int____size", calling the push_back method of the vector<double> class, passing an integer vector as the first parameter yields this signature : "vector_double____push_back___integer" .... (the CPP:::getRoutineSignature implements the convention)

Here is a full example using the set<int> class. Sets are a good example of a data structure that is not available in R. Basically it keeps its objects sorted

> # create the object
> x <- new( CPP("set<int>") )
> # insert data using the insert method
> # see : insert
> x$insert( sample( 1:20 ) )
> # ask for the size of the set
> x$size()
[1] 20
> # bring it back as an R classic integer vector
> as.vector( x )
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20

Currently the package is my excuse to learn about the standard template library, and it is quite possible that the functionality will be merged into the Rcpp it currently depends on. Because of this volatility, I'll use the Rcpp-devel mailing list instead of creating a new one.