Romain Francois, Professional R Enthusiast - Snippet - Comments
Independent statistical/R consultant
Updated: 2013-03-24T15:53:22+01:00

Code Snippet : List of CRAN packages - Romain Francois (2009-10-13T20:52:33+02:00)
<p>Cool. Thanks.</p>
<p>I should have used perl = TRUE in the sub call to fix problem #2.</p>

Code Snippet : List of CRAN packages - Rahul Premraj (2009-10-13T15:45:02+02:00)
<p>Joys of pasting to HTML! I spotted two errors in the copied script above. I also modified the statement that builds the URL of the HTML index page.</p>
<p>Fixed script below:</p>
<p>index = paste(getOption("repos"), "/web/packages/index.html", sep = "")<br />
html <- readLines( index )<br />
html <- grep( "./../web/packages/", html, value = TRUE )<br />
data <- sub( '^.*index.html">(.*?)(.*?)$', "\\1 @@ \\2", html, perl = TRUE )<br />
data <- gsub( '<[^>]+>', ' ', data )<br />
data <- trim(gsub( '@@', "", data))<br />
packages <- do.call( rbind, strsplit( data, " {3,3}" ) )<br />
head( packages, 20 )</p>

Code Snippet : List of CRAN packages - Rahul Premraj (2009-10-13T15:29:35+02:00)
<p>Thanks for this! Quite a neat trick :-)</p>
<p>Unfortunately, the script didn't run so smoothly on Mac OS for the following reasons:</p>
<p>1. index <- sub( "src/contrib", "web/packages/index.html", repo ) didn't work because the returned value for contrib.url(getOption("repos")) seems to be different on Mac OS.</p>
<p>2. data <- sub( '<.*?>', '', data ) replaced everything between the first "<" and last ">" instead of the first "<" and the next ">". In turn, the description of the package got erased.</p>
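<p>To make problem 2 concrete, here is a minimal sketch of how the three pattern styles behave on one line. The sample string <code>x</code> is hypothetical (only roughly shaped like a row of the CRAN index page), and the behaviour of a bare <code>.*?</code> without <code>perl = TRUE</code> depends on the regex engine of the R version in use, so it is deliberately left out.</p>

```r
# Hypothetical input line, for illustration only
x <- '<a href="../../web/packages/abc/index.html">abc</a> Some description'

# Greedy: ".*" runs from the first "<" to the LAST ">", taking the link text with it
greedy <- sub("<.*>", "", x)

# Lazy quantifier under PCRE: stops at the NEXT ">"
# (sub() replaces only the first match, so the closing tag remains)
lazy <- sub("<.*?>", "", x, perl = TRUE)

# Negated character class: a match can never cross a ">", whatever the engine
tags_removed <- gsub("<[^>]+>", " ", x)

print(greedy)
print(lazy)
print(tags_removed)
```

<p>The negated class <code>[^>]</code> is the form used in the fixed script, and it does not rely on lazy-quantifier support at all.</p>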
<p>The following fixed script worked fine on my machine:</p>
<p>repo <- contrib.url(getOption("repos"))<br />
index <- gsub( "bin.*", "web/packages/index.html", repo)<br />
html <- readLines( index )<br />
html <- grep( "./../web/packages/", html, value = TRUE )<br />
data <- sub( '^.*index.html">(.*?)(.*?)$', "\\1 @@ \\2", html, perl = TRUE )<br />
data <- gsub( '<[^>]+>', ' ', data )<br />
data <- trim(gsub( '@@', "", data))<br />
packages <- do.call( rbind, strsplit( data, " {3,3}" ) )<br />
head( packages, 20 )</p>
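
<p>On problem 1, one way to avoid depending on the platform-specific value of contrib.url() is to ask for the source URL explicitly, or to build the index URL straight from the repository root. The sketch below assumes a mirror URL chosen only for illustration; in a real session the value would come from getOption("repos"), which may still hold the placeholder "@CRAN@" if no mirror has been selected.</p>

```r
# Assumed mirror, for illustration only
repos <- "https://cran.r-project.org"

# With type = "source" the result ends in "src/contrib" on every platform
src_url <- utils::contrib.url(repos, type = "source")
index_from_src <- sub("/src/contrib$", "/web/packages/index.html", src_url)

# Alternative: build the URL from the repository root, dropping any trailing slash
index_from_root <- paste(sub("/$", "", repos), "/web/packages/index.html", sep = "")

print(index_from_src)
print(index_from_root)
```

<p>Both variants give the same URL here; neither fetches anything, so readLines( index ) would still be needed to download the page.</p>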