Romain Francois, Professional R Enthusiast

Tag - python

Friday, January 23 2009

R wrapper in open turns

This is an attempt to create a wrapper for openturns using R. It is based on the wrapper template called wrapper_calling_shell_command that ships with openturns, and is somewhat inspired by the scilab example. Wrappers allow you to call an external program as the function through which you propagate uncertainty with openturns, so that you can write your function in the language you are familiar with (R here) but still take advantage of openturns. This was done on fedora with R and openturns installed (see this post for how to install openturns on a fedora 10 machine).
The first thing we need to do is to grab the template from the installed open turns.
$ mkdir ~/opwrappers
$ cp -fr /usr/local/share/openturns/WrapperTemplates/wrapper_calling_shell_command ~/opwrappers/rwrapper
$ cd ~/opwrappers/rwrapper/
$ ll
total 300
-rw-r--r-- 1 romain romain     27 2009-01-23 11:54 AUTHORS
-rwxr-xr-x 1 romain romain   1304 2009-01-23 11:54 bootstrap
-rw-r--r-- 1 romain romain 199260 2009-01-23 11:54 ChangeLog
-rw-r--r-- 1 romain romain    216 2009-01-23 11:54 code_C1.data
-rw-rw-r-- 1 romain romain   1594 2009-01-23 12:42 configure.ac
-rw-r--r-- 1 romain romain  18002 2009-01-23 11:54 COPYING
-rwxr-xr-x 1 romain romain   1794 2009-01-23 11:54 customize
-rw-r--r-- 1 romain romain   9498 2009-01-23 11:54 INSTALL
drwxr-xr-x 2 romain romain   4096 2009-01-23 11:54 m4
-rw-rw-r-- 1 romain romain    571 2009-01-23 12:42 Makefile.am
-rw-r--r-- 1 romain romain    447 2009-01-23 11:54 myCFunction.c
-rw-r--r-- 1 romain romain    455 2009-01-23 11:54 myCFunction.h
-rw-r--r-- 1 romain romain      0 2009-01-23 11:54 NEWS
-rw-r--r-- 1 romain romain    925 2009-01-23 11:54 README
-rwxrwxr-x 1 romain romain    435 2009-01-23 12:03 rwrapper.R
-rw-rw-r-- 1 romain romain   3722 2009-01-23 12:42 rwrapper.xml.in
-rw-rw-r-- 1 romain romain   1442 2009-01-23 12:42 test.py
-rw-rw-r-- 1 romain romain   9349 2009-01-23 12:42 wrapper.c
Next, customize the wrapper so that it is called rwrapper instead of the default wcode. This is achieved by the customize script:
$ ./customize rwrapper
The files myCFunction.* are useless and you can remove them at this point. We won't need the code_C1.c file either, since we are going to write an R script instead.
$ rm myCFunction.* 
$ rm code_C1.c
$ ll
total 288
-rw-r--r-- 1 romain romain     27 2009-01-23 11:54 AUTHORS
-rwxr-xr-x 1 romain romain   1304 2009-01-23 11:54 bootstrap
-rw-r--r-- 1 romain romain 199260 2009-01-23 11:54 ChangeLog
-rw-r--r-- 1 romain romain    216 2009-01-23 11:54 code_C1.data
-rw-rw-r-- 1 romain romain   1594 2009-01-23 12:42 configure.ac
-rw-r--r-- 1 romain romain  18002 2009-01-23 11:54 COPYING
-rwxr-xr-x 1 romain romain   1794 2009-01-23 11:54 customize
-rw-r--r-- 1 romain romain   9498 2009-01-23 11:54 INSTALL
drwxr-xr-x 2 romain romain   4096 2009-01-23 11:54 m4
-rw-rw-r-- 1 romain romain    571 2009-01-23 12:42 Makefile.am
-rw-r--r-- 1 romain romain      0 2009-01-23 11:54 NEWS
-rw-r--r-- 1 romain romain    925 2009-01-23 11:54 README
-rwxrwxr-x 1 romain romain    435 2009-01-23 12:03 rwrapper.R
-rw-rw-r-- 1 romain romain   3722 2009-01-23 12:42 rwrapper.xml.in
-rw-rw-r-- 1 romain romain   1442 2009-01-23 12:42 test.py
-rw-rw-r-- 1 romain romain   9349 2009-01-23 12:42 wrapper.c
Next, we need to write the R script that does the actual work. It takes the input file and the output file as arguments, reads data from the input file and writes the result to the output file. Something like this:
#!/usr/bin/env Rscript

# grab arguments
argv <- commandArgs( TRUE )
datafile <- argv[1]
outfile  <- argv[2] 

# read data from data file 
rl <- readLines( datafile )
extract <- function( index = 1 ){
  rx <- sprintf( "^(I%d *= *)(.*)$", index )
  as.numeric( gsub( rx, "\\2", grep(rx, rl, value = TRUE ) ) ) 
}
x1 <- extract( 1 )
x2 <- extract( 2 )
x3 <- extract( 3 )

out <- x1 + x2 + x3
cat( "O1 = ", out, sep = "", file = outfile )
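
For readers who think in python rather than R, here is a minimal sketch of the same read, compute, write cycle. It assumes the data file holds lines of the form "I1 = 1", which is only what the regular expression in the R script suggests. The function names and the temporary directory are made up for the illustration.

```python
import os
import re
import tempfile


def read_inputs(path, count=3):
    """Return the values of I1..I<count> found in the data file."""
    with open(path) as handle:
        lines = handle.read().splitlines()
    values = []
    for index in range(1, count + 1):
        # same pattern as the R script: "I<index> = <value>"
        pattern = re.compile(r"^I%d *= *(.*)$" % index)
        matches = [m.group(1) for m in map(pattern.match, lines) if m]
        values.append(float(matches[0]))
    return values


def run(datafile, outfile):
    """Sum the inputs and write the result as 'O1 = <value>'."""
    total = sum(read_inputs(datafile))
    with open(outfile, "w") as handle:
        handle.write("O1 = %r\n" % total)
    return total


# Demo with an assumed input file layout (not the real code_C1.data)
workdir = tempfile.mkdtemp()
datafile = os.path.join(workdir, "code_C1.data")
outfile = os.path.join(workdir, "code_C1.result")
with open(datafile, "w") as handle:
    handle.write("I1 = 1\nI2 = 2\nI3 = 3\n")
run(datafile, outfile)
with open(outfile) as handle:
    written = handle.read()
print(written)
```

The sketch fails with an IndexError if one of the inputs is missing from the file, which is probably what you want to notice early anyway.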

Next, we need to modify the Makefile.am file so that the make install step copies the rwrapper.R file into the wrappers/bin directory later.
ACLOCAL_AMFLAGS = -I m4

wrapperdir          = $(prefix)/wrappers

wrapper_LTLIBRARIES = rwrapper.la
rwrapper_la_SOURCES    = wrapper.c
rwrapper_la_CPPFLAGS   = $(OPENTURNS_WRAPPER_CPPFLAGS)
rwrapper_la_LDFLAGS    = -module -no-undefined -version-info 0:0:0
rwrapper_la_LDFLAGS   += $(OPENTURNS_WRAPPER_LDFLAGS)
rwrapper_la_LIBADD     = $(OPENTURNS_WRAPPER_LIBS)

XMLWRAPPERFILE      = rwrapper.xml
wrapper_DATA        = $(XMLWRAPPERFILE)
EXTRA_DIST          = $(XMLWRAPPERFILE).in test.py code_C1.data

execbindir          = $(prefix)/bin
execbin_DATA        = rwrapper.R
Then, we need to make a few changes to the rwrapper.xml.in file. Here is the definition of the output variable:
        <variable id="O1" type="out">
          <comment>Output 1</comment>
          <unit>none</unit>
          <regexp>O1\S*=\S*(\R)</regexp>
        </variable>
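
To get a feel for what this regexp is meant to pick up, here is a rough python equivalent. It rests on an assumption that is not spelled out in this post: that \S stands for optional whitespace and \R for a real number in the openturns shorthand. The REAL pattern and the read_output helper are made up for the illustration.

```python
import re

# Assumed translation of the shorthand: \S -> optional whitespace,
# \R -> a real number (this is a guess, not the openturns definition)
REAL = r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?"
PATTERN = re.compile(r"O1\s*=\s*(" + REAL + ")")


def read_output(text):
    """Extract the value of O1 from the content of the result file."""
    match = PATTERN.search(text)
    if match is None:
        raise ValueError("no O1 value found")
    return float(match.group(1))


# the R script above writes a line of this shape
value = read_output("O1 = 6")
print(value)
```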

You also need to add the subst tag in the output file definition (at least with this version of openturns) :
      <!-- An output file -->
      <file id="result" type="out">
        <name>The output result file</name>
        <path>code_C1.result</path>
        <subst>O1</subst>
      </file>
 
and then change the command that invokes the script as follows:
    <command>Rscript @prefix@/bin/rwrapper.R code_C1.data code_C1.result</command>

Download the full rwrapper.xml.in file.
Once this is done (you can grab a tar.gz of the wrapper at that stage), you can compile the wrapper by following these steps:
$ ./bootstrap
$ ./configure --prefix=/home/romain/openturns --with-openturns=/usr/local
$ make 
$ make install
If all goes well, you should have a rwrapper.R file in the ~/openturns/bin directory and a rwrapper.xml file in the ~/openturns/wrappers directory.
Before trying the wrapper, we need to copy the input file into the directory where we are going to run openturns (say /tmp):
$ cp code_C1.data /tmp
$ cd /tmp
Now we are good to go and can start using the wrapper from open turns:
$ python
>>> from openturns import *
>>> p = NumericalPoint( (1,2,3))
>>> f = NumericalMathFunction( "rwrapper" )
>>> print f(p )
class=NumericalPoint name=Unnamed dimension=1 implementation=class=NumericalPointImplementation name=Unnamed dimension=1 values=[6]
>>> 1+2+3
6
The drawback of this approach is that each time the function needs to be evaluated, a new R session is launched by Rscript. Depending on the number of iterations we want to do, this can seriously affect the run time of the study. A way to get around this is to use a single R session and let the wrapper communicate with it. I can see at least two ways to do it:
  • by writing the function in python and letting python communicate with R (using rpy for instance)
  • by writing a C wrapper that would initialize a connection to an R server when the function is created, and call it whenever the function needs to be called
I'll try to tell these stories in another post.
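
The general idea of a single long-lived session can be sketched in python without committing to either option. In the sketch below a small python child process stands in for the R session: it is started once and then answers one request per line. The PersistentWorker class and the line-based protocol are made up for the illustration; a real setup would talk to R instead.

```python
import subprocess
import sys

# Stand-in for the long-lived R session: reads one line of numbers per
# request and answers with their sum (hypothetical worker, not R)
WORKER_SOURCE = r"""
import sys
while True:
    line = sys.stdin.readline()
    if not line:
        break
    values = [float(tok) for tok in line.split()]
    sys.stdout.write("%r\n" % sum(values))
    sys.stdout.flush()
"""


class PersistentWorker:
    def __init__(self):
        # the process is started once, when the function is created
        self.proc = subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            universal_newlines=True,
            bufsize=1,
        )

    def evaluate(self, point):
        # each evaluation is one line out and one line back
        self.proc.stdin.write(" ".join(repr(float(x)) for x in point) + "\n")
        self.proc.stdin.flush()
        return float(self.proc.stdout.readline())

    def close(self):
        self.proc.stdin.close()
        self.proc.wait()


worker = PersistentWorker()
results = [worker.evaluate(p) for p in [(1, 2, 3), (4, 5, 6)]]
worker.close()
print(results)
```

The startup cost is paid once in the constructor instead of once per evaluation, which is the whole point.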

Wednesday, January 21 2009

python code in sweave document

It would be great if we could use not only R or S in sweave code chunks but also other languages, such as python. Why would you want that? Well, python has some graphics capabilities R does not have, and some software is written in python while you still want to write your document in sweave. Here is a first attempt, obviously not complete.

A custom sweave driver

The first trick is to write a custom sweave driver, based on the standard RweaveLatex driver, that does something with the content of a chunk when the engine is set to python:

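driver <- RweaveLatex()
# keep a reference to the original runcode so that R chunks are handled as usual
runcode <- driver$runcode
driver$runcode <- function(object, chunk, options){
  if( options$engine == "python" ){
    # python chunk: copy it verbatim into a python environment
    driver$writedoc( object, c("\\begin{python}", chunk, "\\end{python}") )
  } else{
    # anything else goes through the standard driver
    runcode( object, chunk, options )
  }
}
Sweave( "python.Rnw", driver = driver )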
The only thing the driver does is convert python code chunks into a python environment, so that this in the Rnw file:
<<hello,engine=python>>=
print "hello"
print "world"
@
becomes that in the tex file:
\begin{python}
print "hello"
print "world"
\end{python}
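
The transformation itself is plain text rewriting, and can be mimicked in a few lines of python. This is only an illustration of the effect on the text, not of how Sweave works internally. The convert_python_chunks function is made up, and chunks with another engine are left untouched.

```python
import re

# a chunk header looks like <<label,option=value>>=
CHUNK_HEADER = re.compile(r"^<<(.*)>>=\s*$")


def convert_python_chunks(lines):
    """Wrap the body of engine=python chunks in a python environment."""
    out = []
    in_python = False
    for line in lines:
        header = CHUNK_HEADER.match(line)
        if header and not in_python:
            options = [opt.strip() for opt in header.group(1).split(",")]
            if "engine=python" in options:
                in_python = True
                out.append(r"\begin{python}")
                continue
        if in_python and line.strip() == "@":
            # a lone @ closes the chunk
            in_python = False
            out.append(r"\end{python}")
            continue
        out.append(line)
    return out


rnw = ['<<hello,engine=python>>=', 'print "hello"', 'print "world"', '@']
tex = convert_python_chunks(rnw)
print("\n".join(tex))
```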

Process the python code


Then you need to install the python package into your texmf tree and texhash (just google around if you don't know what it means). The python package defines the python environment so that when you compile the tex file, latex calls python and brings back the output of the python script. The catch is that you need to compile your tex file with the option -shell-escape.
$ pdflatex -shell-escape python.tex
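
What happens to each python environment can be pictured as: write the code to a file, run python on it, bring back what it printed. Here is a small python sketch of that cycle. It says nothing about how the latex package does it internally, and the run_chunk helper is made up. The chunk uses the function form of print so that the sketch runs on a current interpreter.

```python
import os
import subprocess
import sys
import tempfile


def run_chunk(code):
    """Run a piece of python code in a separate process, return its output."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as handle:
        handle.write(code)
        script = handle.name
    try:
        done = subprocess.run(
            [sys.executable, script],
            stdout=subprocess.PIPE,
            universal_newlines=True,
            check=True,
        )
        return done.stdout
    finally:
        # the temporary script is not needed once it has run
        os.remove(script)


output = run_chunk('print("hello")\nprint("world")\n')
print(output)
```

Running arbitrary code from a document is exactly why latex asks for -shell-escape explicitly.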

Beyond the simple trick


So we can get hello world from python. This needs more thinking to enable:
  • production of graphics from python with a fig option, just like you do it in R, see this for example
  • some way to share data between R and python, so that variables created in the R world could be used in the python world and vice versa. I don't know the best way to do that at the moment, but off the top of my head we could either use rpy for the communication or the database that gets generated by the cacheSweave package

Monday, January 19 2009

Install open turns on fedora 10

Introduction

After spending quite some time installing openturns on my fedora box, I feel I should post about it to save other people the time. The install page announces that support for RPM packages will be available soon (we all know the real definition of soon, don't we: it means "we don't need it for ourselves, so if you want it, do it", which is fair enough).

If I had more time, I would learn how to make rpms and provide one for openturns, but this does not seem necessary for now as openturns installs fine from source, at least if you work around a few things. This post is absolutely not a replacement for the real install notes, but rather a set of guidelines on how to read these notes from a fedora perspective.

Download

Grab the tar.gz from sourceforge and unzip it somewhere.

Dependencies

This is what the install notes say :

Till 0.12.1 included:

  • GCC C, C++ and Fortran compilers (>= 3.3.5 except 4.0.x series, tested with 3.4.5, 3.4.6, 4.1.1, 4.1.2 & 4.2.2)
  • Python interpreter (>= 2.4.x)
  • R statistical language (>= 2.0)
  • Xerces-C XML parser (>= 2.6.0, tested with 2.7.0)
  • BOOST C++ library (>= 1.33.1)
  • LAPACK Linear Algebra library (>= 3.0)
  • Qt (3.3.x)
  • python-qt if you want to use the embedded image viewer ViewImage (TUI only)

Since 0.12.2:

  • GCC C, C++ and Fortran compilers (>= 3.3.5 except 4.0.x series, tested with 3.3.5, 3.3.6, 3.4.5, 3.4.6, 4.1.1, 4.1.2, 4.2.2 & 4.3.1)
  • Python interpreter (>= 2.4.x)
  • R statistical language (>= 2.0)
  • Libxml2 XML library (>= 2.6.27)
  • LAPACK Linear Algebra library (>= 3.0)
  • python-qt if you want to use the embedded image viewer ViewImage (TUI only)

Here is what I have done on my fedora machine. python and gcc are already installed unless you really want them not to be, so there is nothing to do there. R is easy to compile from source, but you can get it with yum as well (yum install R).

For the other software, here is my list of yum calls :

# yum install -y xerces-c-devel
# yum install -y boost-devel
# yum install -y lapack-devel
# yum install -y qt3-devel
# yum install -y PyQt-devel
# yum install -y libxml2-devel

I also installed rpy and graphviz to have optional features as well:

# yum install -y rpy
# yum install -y graphviz-devel

After that, the ./configure call should be ok. Here is the summary I got, which sounds good enough.

R Packages

Now you can install the R package rotRPackage which comes with openturns as described in the install page

# R CMD INSTALL utils/rotRPackage_1.4.4.tar.gz

You also need the sensitivity package, but at the time of writing the sensitivity package has changed some of its API and openturns has not caught up with the change, so you have to install version 1.3-1 as opposed to the current version.

The other problem I ran into was that I am using a custom ~/.Rprofile file which contains startup instructions such as requiring R packages. This caused the test cases of openturns to fail, because the expected output was mixed with the standard error stream (which is where require writes its messages). So at least for running the openturns tests, I have modified my .Rprofile file so that it does not load packages or write anything to the standard error stream.
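
The effect is easy to reproduce outside of R. In the python sketch below, a child process plays the role of an R session with a chatty .Rprofile: it writes a startup message to the standard error stream and the actual result to the standard output. The message text and the child script are made up for the illustration.

```python
import subprocess
import sys

# Stand-in for R with a chatty .Rprofile (hypothetical message and result)
CHILD = r"""
import sys
sys.stderr.write("Loading required package: somepackage\n")
sys.stderr.flush()
sys.stdout.write("6\n")
"""

command = [sys.executable, "-c", CHILD]

# both streams captured together: the message pollutes the result
merged = subprocess.run(
    command,
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
    universal_newlines=True,
).stdout

# streams kept apart: the result is clean
separate = subprocess.run(
    command,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,
)

print(repr(merged))
print(repr(separate.stdout))
```

A test that compares the merged capture with an expected "6" fails even though the computation is right, which is the kind of failure described above.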

Installing openturns

When this is ready, you can do :

$ make   # good opportunity to make some coffee while it compiles 
$ make check # everything should be ok
# make install
$ make installcheck # should be ok too

Loading the python module

Reading the FAQ is a good way to save yourself some time, specifically when trying to load the openturns python module. I have added these two lines to my .bash_profile file :

PYTHONPATH=/usr/local/lib/python2.5/site-packages/openturns
export PYTHONPATH
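
If the import still fails, it helps to check what python actually sees. Here is a small sketch using only the standard library. It runs on a current python rather than the 2.5 used in this post, and the helper names are made up.

```python
import importlib.util
import os


def pythonpath_entries():
    """Return the directories listed in the PYTHONPATH variable."""
    raw = os.environ.get("PYTHONPATH", "")
    return [entry for entry in raw.split(os.pathsep) if entry]


def is_importable(name):
    """Tell whether python can locate a top-level module, without importing it."""
    return importlib.util.find_spec(name) is not None


print(pythonpath_entries())
print(is_importable("openturns"))
```

An empty list in the first line of output usually means that the shell in which python was started has not read the modified .bash_profile yet.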

Then, you can start python and start using openturns, which is another story ...

$ python
Python 2.5.2 (r252:60911, Sep 30 2008, 15:41:38)
[GCC 4.3.2 20080917 (Red Hat 4.3.2-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from openturns import *