Romain Francois, Professional R Enthusiast

Sunday, March 24 2013



This blog is moving to The new one is powered by wordpress and gets a subdomain of

See you there

Monday, February 18 2013

Improving the graph gallery

I'm trying to improve the R Graph Gallery, and I'm looking for suggestions from users of the website.

I've started a question on the website's facebook page. Please take a few seconds to vote on the existing improvement ideas and perhaps offer some of your own.


Monday, November 5 2012

OOP with Rcpp modules

The purpose of Rcpp modules has always been to make it easy to expose C++ functions and classes to R. Up to now, Rcpp modules did not have a way to declare inheritance between C++ classes. This is now fixed in the development version, and the next version of Rcpp will have a simple mechanism to declare inheritance.

Consider this simple example: we have a base class Shape with two virtual methods (area and contains) and two classes (Circle and Rectangle), each deriving from Shape and representing a specific shape.

The classes might look like this:
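As a minimal sketch in plain C++ (no Rcpp involved), such a hierarchy could be written as follows. The constructor arguments and member names are assumptions made for illustration; only the class names and the two virtual methods come from the description above.

```cpp
#include <cmath>

// Base class: the two virtual methods every shape must implement.
class Shape {
public:
    virtual ~Shape() {}
    virtual double area() const = 0;
    virtual bool contains(double x, double y) const = 0;
};

// Circle centred at (x, y) with radius r (hypothetical layout).
class Circle : public Shape {
public:
    Circle(double x, double y, double r) : x_(x), y_(y), r_(r) {}
    double area() const { return 3.141592653589793 * r_ * r_; }
    bool contains(double x, double y) const {
        // Inside if the distance to the centre does not exceed the radius.
        return std::hypot(x - x_, y - y_) <= r_;
    }
private:
    double x_, y_, r_;
};

// Axis-aligned rectangle with lower-left corner (x, y), width w and
// height h (hypothetical layout).
class Rectangle : public Shape {
public:
    Rectangle(double x, double y, double w, double h)
        : x_(x), y_(y), w_(w), h_(h) {}
    double area() const { return w_ * h_; }
    bool contains(double x, double y) const {
        return x >= x_ && x <= x_ + w_ && y >= y_ && y <= y_ + h_;
    }
private:
    double x_, y_, w_, h_;
};
```

Because area and contains are declared on Shape, code holding a `const Shape&` can call them without knowing which concrete shape it has.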

And we can expose these classes to R using the following module declarative code:

It is worth noticing that:

  • The area and contains methods are exposed as part of the base Shape class
  • Classes Rectangle and Circle simply declare that they derive from Shape using the derives notation.

R code that uses these classes looks like this:


Thursday, October 25 2012

Rcpp modules more flexible

Rcpp modules just got more flexible (as of revision 3838 of Rcpp, to become 0.9.16 in the future).

Modules have allowed exposing C++ classes for some time now, but developers had to declare custom wrap and as specializations if they wanted their classes to be used as the return type or argument type of a C++ function or method. This led to writing boilerplate code. The newest devel version allows for syntax like this:

The only thing the developer has to do is to declare the class using the macro RCPP_EXPOSED_CLASS. This will declare the appropriate class traits that Rcpp uses for its internal implementations of as and wrap.

In the example, we can see three uses of the new functionality:

  • make_foo : this returns a Foo
  • cloner: this returns a Foo*
  • bla: uses a const Foo& as argument

Sunday, January 15 2012

Crawling facebook with R

So, let's crawl some data out of facebook using R. Don't get too excited though: this is just a weekend what-if project. Anyway, as an example, I want to download some photos where I'm tagged.

First, we need an access token from facebook. I don't know how to get this programmatically, so let's get one manually: log on to facebook and then go to the Graph API Explorer.


Grab the access token and save it into a variable in R

access_token <- "************..."

Now we need to study the graph api to figure out the url we need to build to do what we want to do, e.g. here we want "me/photos". I've wrapped this into an R function:

And then we can use it:

That's it, I told you it was not that exciting, but it was still worth playing with ...


Wednesday, December 14 2011

... And now for solution 17, still using Rcpp

Here comes yet another sequel to the code optimization problem from the R wiki, still using Rcpp, but with a different strategy this time.

Essentially, my previous version (15) was using stringstream, although we don't really need its functionality and it was slowing us down.

Also, the characters "i" and "." are always in the same position, so we can assign them once and for all.
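The two ideas above (no stringstream, fixed characters written once) can be illustrated with a small sketch. This is not the code of attempt 17: the label format "i001.002", the three-digit width, and the names put3 and make_labels are all assumptions chosen for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Write the three decimal digits of value (0..999) into buf[pos..pos+2].
static void put3(std::string& buf, std::size_t pos, int value) {
    buf[pos]     = static_cast<char>('0' + (value / 100) % 10);
    buf[pos + 1] = static_cast<char>('0' + (value / 10) % 10);
    buf[pos + 2] = static_cast<char>('0' + value % 10);
}

// Hypothetical helper: labels "iAAA.BBB" for all a, b in 1..n (n <= 999).
std::vector<std::string> make_labels(int n) {
    std::vector<std::string> out;
    // 'i' and '.' are set here once; the loops only overwrite the digits.
    std::string buf = "i000.000";
    for (int a = 1; a <= n; ++a) {
        put3(buf, 1, a);              // digits of a change once per outer step
        for (int b = 1; b <= n; ++b) {
            put3(buf, 5, b);          // digits of b change on every step
            out.push_back(buf);       // copy the finished label
        }
    }
    return out;
}
```

No stream object and no format string are involved; each label costs three character writes for the inner index plus one string copy.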

So without further ado, here is attempt 17:

With quite a speedup from attempt 15:

                test replications elapsed relative
2 generateIndex17(n)           20   9.363 1.000000
1 generateIndex15(n)           20  17.795 1.900566

Saturday, November 26 2011

int64: 64 bit integer vectors for R


The Google Open Source Programs Office sponsored me to create the new int64 package, which was released to CRAN a few days ago. The package has been mentioned in an article on Google's open source blog.

The package defines classes int64 and uint64 that represent signed and unsigned 64 bit integer vectors. The package also allows conversion of several types (integer, numeric, character, logical) to 64 bit integer vectors, arithmetic operations as well as other standard group generic functions, and reading 64 bit integer vectors as a data.frame column using int64 or uint64 as the colClasses argument.

The package has a vignette that details its features, and several examples are given in the usual help files. Once again, I've used RUnit for quality assurance of the package code.

int64 has been developed so that 64 bit integer vectors are represented using only R data structures, i.e. data is not represented as external pointers to some C++ object. Instead, each 64 bit integer is represented as a pair of regular 32 bit integers, each of them carrying half the bits of the underlying 64 bit integer. This was a deliberate design choice so that 64 bit integer vectors can be serialized and used as data frame columns.
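To make the "two halves" idea concrete, here is a minimal sketch of splitting a 64 bit integer into two 32 bit integers and putting it back together. The struct and function names are hypothetical, and the actual storage layout used inside the int64 package (for instance which half comes first) is not taken from its sources.

```cpp
#include <cstdint>

// Two regular 32 bit integers, each carrying half of the 64 bits.
struct Halves {
    std::int32_t high;  // upper 32 bits
    std::int32_t low;   // lower 32 bits
};

// Split: work on the unsigned bit pattern so that shifts are well defined.
// The final narrowing to int32_t assumes two's complement, which holds on
// all mainstream platforms.
Halves split_int64(std::int64_t x) {
    std::uint64_t bits = static_cast<std::uint64_t>(x);
    Halves h;
    h.high = static_cast<std::int32_t>(static_cast<std::uint32_t>(bits >> 32));
    h.low  = static_cast<std::int32_t>(static_cast<std::uint32_t>(bits & 0xFFFFFFFFu));
    return h;
}

// Combine: reinterpret each half as unsigned before widening, otherwise a
// negative low half would sign-extend into the upper 32 bits.
std::int64_t combine_int64(Halves h) {
    std::uint64_t hi = static_cast<std::uint32_t>(h.high);
    std::uint64_t lo = static_cast<std::uint32_t>(h.low);
    return static_cast<std::int64_t>((hi << 32) | lo);
}
```

Since both halves are ordinary 32 bit integers, a vector of such pairs can live in plain integer storage, which is what makes serialization and data frame columns straightforward.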

The package contains C++ headers that third party packages can use (via LinkingTo: int64) to access the C++ internals. This allows creation and manipulation of the objects in C++. The internals will be documented in another vignette for package developers who wish to use them. For the moment, the main entry point is the C++ template class LongVector.

I'm particularly proud that Google trusted me to sponsor the development of int64. The next versions of packages Rcpp and RProtoBuf take advantage of the facilities of int64, e.g. Rcpp gains wrapping of C++ containers of 64 bit integers as R objects of classes int64 and uint64 and RProtoBuf improves handling of 64 bit integers in protobuf messages. More on this later.

Thursday, November 10 2011

Code optimization, an Rcpp solution

Tony Breyal woke up an old code optimization problem in this blog post, so I figured it was time for an Rcpp based solution.

This solution moves Henrik Bengtsson's idea (which was the basis of attempt 10) down to C++. The idea was to call sprintf less often than the other solutions do to generate the strings "001", "002", "003", ...
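As a rough illustration of "calling sprintf less", the sketch below formats each zero-padded string exactly once and keeps the results in a vector, so later code can reuse them instead of formatting again. It is not the code that was benchmarked; the function name padded and the fixed width of three digits are assumptions.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Format 1..n as zero-padded strings of width three: "001", "002", ...
// Exactly one snprintf call is made per distinct value.
std::vector<std::string> padded(int n) {
    std::vector<std::string> out;
    if (n > 0) out.reserve(static_cast<std::size_t>(n));
    char buf[16];  // large enough for any int printed with "%03d"
    for (int i = 1; i <= n; ++i) {
        std::snprintf(buf, sizeof buf, "%03d", i);
        out.push_back(buf);
    }
    return out;
}
```

Code that needs the same padded number many times can then index into the cached vector (value i lives at position i - 1), which turns n * n formatting calls into n.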

We can benchmark this version using the rbenchmark package:

> library(rbenchmark)
> n <- 2000
> benchmark(
+     generateIndex10(n), 
+     generateIndex11(n),
+     generateIndex12(n), 
+     generateIndex13(n),
+     generateIndex14(n),
+     columns = 
+        c("test", "replications", "elapsed", "relative"),
+     order = "relative",
+     replications = 20
+ )
                test replications elapsed relative
5 generateIndex14(n)           20  21.015 1.000000
3 generateIndex12(n)           20  22.034 1.048489
4 generateIndex13(n)           20  23.436 1.115203
2 generateIndex11(n)           20  23.829 1.133904
1 generateIndex10(n)           20  30.580 1.455151

Sunday, October 30 2011

Rcpp reverse dependency graph

I played around with reverse dependencies of Rcpp. At the moment, 44 packages depend on Rcpp and the number goes up to 53 when counting recursive reverse dependencies.

I've used graphviz for the representation of the directed graph.
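The two steps involved (collecting recursive reverse dependencies, then writing a Graphviz dot description) can be sketched as follows. This is not the code used for the picture: it assumes a hypothetical in-memory map from each package to the packages that depend on it directly, and the type and function names are made up for the example.

```cpp
#include <cstddef>
#include <map>
#include <queue>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// package -> packages that directly depend on it (hypothetical input)
typedef std::map<std::string, std::vector<std::string> > RevDeps;

// Breadth-first walk: every package that depends on root, directly or not.
std::set<std::string> recursive_revdeps(const RevDeps& direct,
                                        const std::string& root) {
    std::set<std::string> seen;
    std::queue<std::string> todo;
    todo.push(root);
    while (!todo.empty()) {
        std::string pkg = todo.front();
        todo.pop();
        RevDeps::const_iterator it = direct.find(pkg);
        if (it == direct.end()) continue;
        for (std::size_t i = 0; i < it->second.size(); ++i) {
            const std::string& dep = it->second[i];
            // insert().second is true only the first time dep is seen
            if (dep != root && seen.insert(dep).second) todo.push(dep);
        }
    }
    return seen;
}

// One dot edge "dependent -> dependency" per direct relation reachable
// from root.
std::string to_dot(const RevDeps& direct, const std::string& root) {
    std::set<std::string> nodes = recursive_revdeps(direct, root);
    nodes.insert(root);
    std::ostringstream out;
    out << "digraph revdeps {\n";
    for (std::set<std::string>::const_iterator n = nodes.begin();
         n != nodes.end(); ++n) {
        RevDeps::const_iterator it = direct.find(*n);
        if (it == direct.end()) continue;
        for (std::size_t i = 0; i < it->second.size(); ++i)
            out << "  \"" << it->second[i] << "\" -> \"" << *n << "\";\n";
    }
    out << "}\n";
    return out.str();
}
```

The size of the set returned by recursive_revdeps corresponds to the "recursive" count, while the length of the root's own entry in the map corresponds to the direct count.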


Here is the code I've used to generate the dot file:

Tuesday, October 11 2011

R Bloggers widget in R Graph Gallery

Following the last post about the partnership with R Bloggers, Tal and I have added a small widget to the gallery main page to present links to recent posts on R Bloggers.


It uses the wordpress api to grab information about the rss feed generated by R Bloggers and displays links one at a time, using the same jquery magic as in the widget that was integrated in R Bloggers a few days ago.

Saturday, October 8 2011

R Graph Gallery widget in R Bloggers

The R Bloggers website, maintained by Tal Galili, aggregates blogs (including mine) from many people of the R community.

Tal and I have been wondering how to tie R Bloggers to the gallery so that each website supports the other. To that end, I've made a quick and dirty widget, using the jquery cycle plugin, that is now on the right sidebar of R Bloggers, inside the related sites box.


The widget first chooses 20 items from the gallery at random, and then cycles through them.

This is an initial design made specifically for R Bloggers, but it is quite likely that I will improve on this and make the widget more generic so that other websites can use it to advertise the gallery.

Monday, October 3 2011

Twitter updates on R Graph Gallery

I've added a twitter search widget that searches for the #rgraphgallery hashtag or the url of the gallery on the front page.


Friday, September 30 2011

R Graph Gallery - Donations Welcome

I've added a PayPal button into the graph, just in case people want to help the development of the website


Thursday, September 22 2011

Facebook page about the Graph Gallery

I've just created a facebook page about the R Graph Gallery

I hope this will improve the experience of the website by making it more social. For example, I anticipate that people will share their own graphs by posting a picture on the facebook page wall.

As part of this, I've added the usual "find us on facebook" widget on the home page of the gallery.


Wednesday, September 21 2011

More facebook and google plus on the Graph Gallery

Following up on yesterday's post about the facebook like button, I've added some more social things to the gallery. The main page gains a google plus "plus one" button, and each graph page now has a +1 button, a facebook like button, and a facebook comment box.


Tuesday, September 20 2011

Facebook like button in Graph Gallery

I've added a facebook like button to the home page of the R Graph Gallery and to each image page, e.g. this one, which I "like".


Friday, April 29 2011

Rcpp Workshop slides

Dirk and I gave a full day Rcpp workshop yesterday in Chicago before the R in Finance conference.

The pdfs of the slides are available here: part 1 (intro), part 2 (details), part 3 (modules) and part 4 (applications)

Sunday, April 17 2011

Rcpp article in JSS

The Journal of Statistical Software published our Rcpp article

Wednesday, March 30 2011

Rcpp workshop in Chicago on April 28th


This year's R/Finance conference will be preceded by a full-day masterclass on Rcpp and related topics, which will be held on Thursday, April 28, 2011, on the Univ. of Illinois at Chicago campus.

Join Dirk Eddelbuettel and Romain Francois for six hours of detailed and hands-on instruction and discussion around Rcpp, inline, RInside, RcppArmadillo and other packages, in an intimate small-group setting.

The full-day format allows us to combine a morning introductory session with a more advanced afternoon session while leaving room for sufficient breaks. There will be about six hours of instruction, a one-hour lunch break and two half-hour coffee breaks.

Morning session: "A hands-on introduction to R and C++"

The morning session will provide a practical introduction to the Rcpp package (and other related packages). The focus will be on simple and straightforward applications of Rcpp in order to extend R and/or to significantly accelerate the execution of simple functions.

The tutorial will cover the inline package which permits embedding of self-contained C, C++ or Fortran code in R scripts. We will also discuss RInside to embed R code in C++ applications, as well as standard Rcpp extension packages such as RcppArmadillo for linear algebra and RcppGSL.

Afternoon session: "Advanced R and C++ topics"

This afternoon tutorial will provide a hands-on introduction to more advanced Rcpp features. It will cover topics such as writing packages that use Rcpp, how 'Rcpp modules' and the new R ReferenceClasses interact, and how 'Rcpp sugar' lets us write C++ code that is often as expressive as R code. Another possible topic, time permitting, may be writing glue code to extend Rcpp to other C++ projects.

We also hope to leave some time to discuss problems brought by the class participants.


Prerequisites: knowledge of R as well as general programming knowledge; C or C++ knowledge is helpful but not required.

Users should bring a laptop set up so that R packages can be built. That means on Windows, Rtools needs to be present and working, and on OS X the Xcode package should be installed.


Registration is available via the R/Finance conference at

or directly at RegOnline

The cost is USD 500 for the whole day, and space will be limited.


Please contact us directly at

Friday, December 3 2010

Evolution of Rcpp code size

I've been contributing to Rcpp for about a year now, initially to add missing bits that were needed for the development of RProtoBuf. This led to a complete redesign of the API, which now goes way beyond the initial code (which we now call the classic Rcpp API). This has been quite a journey, both in terms of development, with more than 1500 commits to the svn repository of the project on R-forge, and in terms of promotion, with presentations at RMetrics 2010, useR 2010, LondonR and at Google, as well as many blog posts about Rcpp and the packages that derive from it.

I wanted to take this opportunity to express visually how vibrant the development of Rcpp has been since it was first relaunched in 2008, and since I started to contribute.

The graph below shows the evolution of the number of lines (counting the .h, .cpp, .R, .Rd, .Rnw files) across released versions of the Rcpp package on CRAN.

The first thing I need for this is to download the 32 versions of Rcpp that have been released since 0.6.0.

Then, all it takes is some processing with R to extract the relevant information (number of lines in files of interest), and present the data in a graph. I'm also taking this opportunity to have some fun with raster images and the png package.


The code explosion that started around version 0.7.8 marks the beginning of development of two of the most exciting and addictive projects I have ever worked on: modules and sugar.

The acceleration between 0.8.8 and the current version 0.8.9 represents many of the improvements that were made in modules. That alone, with more than 8000 new lines of code and documentation, represents about 4 times as many lines as the total number of lines in 0.6.0.

We still have plenty of ideas, and Rcpp will continue to evolve to deliver a quality interface between R and C++, to the best of the current team's abilities.

The full code is available below:
