This blog post, http://blogr-cs.blogspot.com/2012/12/integration-of-r-rstudio-and-hadoop-in.html, explains how to modify a Cloudera VM to include R, RStudio and the RHadoop library. This document shows the changes to those steps needed to support the Hortonworks Sandbox.
Sandbox VM as seen in VMware Fusion
The IP address is blurred out. I simply installed the latest Sandbox VMware .ova file. When I downloaded the file from Hortonworks, it was saved with the extension .ovf, which gives Fusion (5.03) trouble. Manually changing the extension to .ova made the import work. Hortonworks import instructions are available here. The actual steps to install R, RStudio and the rmr2 library are shown in the following history listing from my ssh session to the VM.
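If you prefer to do the rename from a terminal, here is a minimal sketch. The function name rename_ovf_to_ova and the file name in the usage line are my own placeholders, not anything Hortonworks or VMware ships; it only changes the extension and does not touch the file contents.

```shell
# rename_ovf_to_ova FILE
# Renames FILE from *.ovf to *.ova; refuses anything that does not end in .ovf.
rename_ovf_to_ova() {
  case "$1" in
    *.ovf) mv -- "$1" "${1%.ovf}.ova" ;;
    *)     echo "not an .ovf file: $1" >&2; return 1 ;;
  esac
}
```

Usage would look something like rename_ovf_to_ova ~/Downloads/Sandbox.ovf, where the path is whatever your browser actually saved.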
An image of the steps needed for the install
Here is the text of these commands for cut-n-paste convenience:
sudo rpm -Uvh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
sudo yum -y install git wget R
ls /etc/default
sudo ln -s /etc/default/hadoop /etc/profile.d/hadoop.sh
cat /etc/profile.d/hadoop.sh | sed 's/export //g' > ~/.Renviron
wget http://download2.rstudio.org/rstudio-server-0.97.332-x86_64.rpm
sudo yum install --nogpgcheck rstudio-server-0.97.332-x86_64.rpm
sudo R
wget --no-check-certificate http://goo.gl/uV6Y9
sudo R CMD INSTALL rmr2_2.1.0.tar.gz
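One step in the listing above that may not be obvious is the sed line. R reads ~/.Renviron at startup and expects plain NAME=value lines, so the command strips the word export from the shell profile script. The sketch below shows the same transformation on a scratch file. The two variables in it are hypothetical stand-ins; I am not reproducing the real contents of /etc/default/hadoop, which may differ on your Sandbox.

```shell
# Scratch directory so nothing here touches the real ~/.Renviron.
tmpdir=$(mktemp -d)

# Hypothetical stand-in for /etc/profile.d/hadoop.sh (assumed contents).
cat > "$tmpdir/hadoop.sh" <<'EOF'
export HADOOP_HOME=/usr/lib/hadoop
export HADOOP_CMD=/usr/bin/hadoop
EOF

# Same sed expression as in the listing: drop every "export " prefix.
sed 's/export //g' "$tmpdir/hadoop.sh" > "$tmpdir/Renviron"

cat "$tmpdir/Renviron"
# HADOOP_HOME=/usr/lib/hadoop
# HADOOP_CMD=/usr/bin/hadoop
```

If the real file has lines that are not simple export statements, they pass through unchanged, so it is worth looking at ~/.Renviron after running the step.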
After the sudo R step, you need to install some R package prerequisites:
install.packages( c('RJSONIO', 'itertools', 'digest', 'Rcpp', 'functional', 'plyr', 'stringr'), repos='http://cran.revolutionanalytics.com')
install.packages( c('reshape2'), repos='http://cran.revolutionanalytics.com')
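If you would rather not type these at the interactive R prompt, the same install can probably be driven from the shell with Rscript -e. This is a sketch I have not run on the Sandbox. It folds both install.packages calls into one, and it only prints the command (a dry run) so you can inspect it first. The variable names are mine, and the word splitting on $pkgs assumes bash or sh.

```shell
# Packages and repository taken from the two install.packages calls above.
pkgs="RJSONIO itertools digest Rcpp functional plyr stringr reshape2"
repo="http://cran.revolutionanalytics.com"

# Build an R character vector body: 'a','b',... (printf repeats its format
# once per word; the trailing comma is trimmed below).
quoted=$(printf "'%s'," $pkgs)
r_expr="install.packages(c(${quoted%,}), repos='${repo}')"

# Dry run: print the command. Remove the leading "echo" to actually run it.
echo sudo Rscript -e "$r_expr"
```

Running it with sudo, as in the original sudo R step, installs the packages into the system library so that RStudio Server users can load them.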
The URL in the second wget statement comes from the current location of the latest rmr2 library as documented here.
After this, you should be able to open RStudio in your browser on port 8787 of the VM.