Following discussions with @atteggiani this morning, we need a dedicated R module environment for the modelevaluation.org analysis package.
Background
ModelEvaluation.org is a server application hosted by UNSW which accepts model output (e.g. from CABLE), runs analyses, and generates standard plots for model intercomparison and benchmarking. This is currently facilitated by two ACCESS-NRI middleware packages:
- https://github.com/CABLE-LSM/benchcab, a package that compiles and runs CABLE across branches.
- https://github.com/CABLE-LSM/meorg_client, a REST client that interacts with the ModelEvaluation.org API to upload model output and trigger remote analyses.
The bottleneck we are facing is that the analyses run on dedicated workers on a remote VM. As use of the service grows, the fixed resources of that VM will become a limiting factor.
To address this, a package of work is underway to port the MEORG analysis software to Gadi, where it can run on the queue with flexible resource allocation facilitated by https://github.com/ACCESS-NRI/hpcpy (yet another ACCESS-NRI package).
The analysis software itself is a custom R package (called PalsR) with specific dependencies, listed below, which we would like to have pre-installed and available via a simple module load, for example:
module use /g/data/vk83/modules
module load palsr/latest
The intention is to use HPCpy to submit queued jobs. Within each job we will simply load the module and trigger a local analysis, then upload the analysis results to modelevaluation.org via the MEORG client. This will vastly reduce server traffic, increase flexibility of resource allocation (since jobs vary in size), and allow us to make use of the Gadi filesystem (where the files already reside after having been run with benchcab).
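To make the intended job contents concrete, here is a minimal sketch of the kind of PBS script such a job might run. It is an illustration only: the function name `write_palsr_job`, the project code, the resource values, and the analysis script path are all hypothetical placeholders, and in practice HPCpy would presumably handle script rendering and submission. The upload step via the MEORG client is deliberately left out, since its invocation is not specified here.

```shell
#!/usr/bin/env bash
# Sketch only: emit a PBS job script that loads the proposed module and
# runs a local analysis. All arguments are hypothetical placeholders:
#   $1  NCI project code to charge the job to
#   $2  path to an R script that drives the PalsR analysis
write_palsr_job() {
    local project="$1" analysis_script="$2"
    cat <<EOF
#!/bin/bash
#PBS -P ${project}
#PBS -q normal
#PBS -l ncpus=1,mem=4GB,walltime=01:00:00
#PBS -l storage=gdata/vk83
#PBS -l wd

# Module path and name as proposed in this issue (not yet available).
module use /g/data/vk83/modules
module load palsr/latest

# Run the analysis locally on the compute node.
Rscript "${analysis_script}"
EOF
}
```

For example, `write_palsr_job ab12 run_analysis.R > palsr_job.sh` would write the script to a file, which could then be submitted by hand with `qsub palsr_job.sh` while testing.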
Based on the aforementioned discussion, it seems that setting up a dedicated module environment using this infrastructure makes the most sense, as R is largely installable via Conda.
Requirements
- R
- R-devel or equivalent R build tooling
- make and compiler toolchain
- NetCDF and HDF5 development libraries
- zlib development library
- Git
- R packages:
  - ncdf4, RJSONIO, plotrix, colorRamps
- Recommended extras:
  - colorspace, scales
- And of course, the PalsR package itself, hosted on an external DVCS maintained by UNSW
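Once an environment exists, a quick check along the following lines could confirm that the listed R packages are actually installed. This is a rough sketch: the helper names `r_missing_expr` and `check_r_packages` are made up for illustration, and PalsR itself is not checked because its installed package name is not confirmed here.

```shell
#!/usr/bin/env bash
# Required and recommended R packages, as listed in this issue.
REQUIRED_PKGS="ncdf4 RJSONIO plotrix colorRamps"
EXTRA_PKGS="colorspace scales"

# Build an R expression that prints the names of any packages in the
# argument list that are not installed. (Hypothetical helper.)
r_missing_expr() {
    local quoted="" pkg
    for pkg in "$@"; do
        quoted="${quoted:+$quoted, }\"$pkg\""
    done
    printf 'pkgs <- c(%s); cat(setdiff(pkgs, rownames(installed.packages())))' "$quoted"
}

# Run the check with Rscript. Returns 0 if everything is installed,
# 1 if packages are missing, 2 if Rscript is not on PATH.
check_r_packages() {
    command -v Rscript >/dev/null 2>&1 || { echo "Rscript not found" >&2; return 2; }
    local missing
    missing="$(Rscript -e "$(r_missing_expr "$@")")"
    if [ -n "$missing" ]; then
        echo "Missing R packages: $missing" >&2
        return 1
    fi
}
```

After loading the module, running `check_r_packages $REQUIRED_PKGS $EXTRA_PKGS` (unquoted, so each name becomes a separate argument) would report anything missing.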
UNSW has provided some installation instructions for their software (largely within a Docker config), which we should be able to reverse-engineer to set up a dedicated R environment to run the package.
Let's discuss further details in this issue. I would appreciate some direction on how to get started using the examples provided in this repo.