Some suggestions for additional functionality for launching jobs with OOD #55

@markyashar

I'd like to suggest some additional functionality for the OOD forms that users fill out when launching a job from within OOD (for example, Jupyter Server: compute via SLURM using Savio partitions).

At a recent workshop on the NAIRR Pilot, I had a chance to launch jobs on ACCESS HPC systems such as EXPANSE (SDSC), PEGASUS, and DELTA/DELTA AI (NCSA) through the OOD platform that each of these systems uses. Their OOD platforms offer some options and functionality that the Savio OOD platform does not seem to have. I'm describing them here in case it is worth exploring whether such functionality (where relevant for Savio) could be added to the Savio OOD platform. For reference, I've attached the "Expanse Portal User Guide" document, which goes into more detail on all of this.

Expanse_Portal_Userguide.pdf

As a specific example, when a user wants to launch a Jupyter Server / JupyterLab compute job via SLURM, a form is presented with various fields (editable text boxes) for configuring the job. Below are some of the fields, along with their help text, that were available on the ACCESS OOD platforms but that the Savio OOD platform does not have:

  • An option for choosing the working directory (with $HOME being the default): Select your project directory.
  • An option for choosing the amount of RAM: Use SLURM format, e.g., 4096 MB, 10 GB, etc. If left blank, a default of 1000 MB will be allocated per CPU requested. Alternatively:
  • Memory required per node (GB): Enter the maximum memory required in units of GB. Default is 2 GB.
  • Apptainer Image File Location: If you want to launch your Jupyter notebook (or lab) session using an Apptainer container that has Jupyter installed within the container, then please provide the full path to the container here. You can find a number of existing Apptainer containers built and maintained by BRC staff for use on Savio at .... Two of the most popular Apptainer containers are the latest PyTorch and TensorFlow containers located at ....
  • Environment modules to be loaded: Enter any environment modules that you require for your Jupyter session. For example, if you're attempting to launch your Jupyter session from an Apptainer container, then please enter 'apptainerpro' (this is just an example) in the field to load the standard Apptainer module available on Savio. If you want to run your Jupyter notebook (or lab) session from a conda environment on CPUs, then you can load the latest version of the system-wide installed Anaconda distribution available on Savio by entering the following module: 'anaconda3/...'. To load multiple modules, enter them as a comma-separated list.
  • Conda Environment: Enter the name of your own custom conda environment.
  • Conda Init: Enter the path to Conda initialization scripts.
  • Conda YAML: Upload a YAML file to build the conda environment at runtime.
  • Reservation (optional): Enter the reservation name if you have a system reservation for running your jobs.
  • Working directory: Choose your Jupyter working directory. Your $HOME directory or your Savio scratch directory are provided here as presets in the dropdown menu. You may also use any other working directory on Savio by providing the full path to the directory in the editable text box.
  • Type: Choose the type of Jupyter interface you wish to use for your interactive session. Note that your software environment (whether provided by an Apptainer container, a Conda environment, or some other mechanism) must have Jupyter installed. For example, if you only have Jupyter with the older notebook interface installed but want to use JupyterLab, you need to install JupyterLab support first.
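
To make the suggestion a bit more concrete, here is a rough Python sketch of how values from a few of these fields (memory, working directory, reservation, modules, conda environment) might be turned into `sbatch` options and job setup commands. This is only an illustration, not how OOD or the ACCESS sites implement it: the field names (`memory`, `working_dir`, `reservation`, `modules`, `conda_env`) and the helper functions are hypothetical. The only things taken as given are the standard `sbatch` options `--mem`, `--chdir`, and `--reservation`, and the `module load` / `conda activate` commands.

```python
import re
import shlex


def normalize_memory(text):
    """Convert form input such as '10 GB' or '4096 MB' to sbatch's '10G' / '4096M'.

    Returns None when the field is blank so the scheduler default applies.
    """
    text = text.strip()
    if not text:
        return None
    match = re.fullmatch(r"(\d+)\s*(?:([KMGT])B?)?", text, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognized memory value: {text!r}")
    number, unit = match.groups()
    # sbatch treats a bare number as megabytes, so make that explicit.
    return f"{number}{unit.upper() if unit else 'M'}"


def split_modules(text):
    """Split a comma-separated module list, ignoring blanks and extra spaces."""
    return [name.strip() for name in text.split(",") if name.strip()]


def build_launch_plan(form):
    """Map hypothetical form fields to (sbatch arguments, setup commands)."""
    sbatch_args = []
    memory = normalize_memory(form.get("memory", ""))
    if memory:
        sbatch_args.append(f"--mem={memory}")
    working_dir = form.get("working_dir", "").strip()
    if working_dir:
        sbatch_args.append(f"--chdir={shlex.quote(working_dir)}")
    reservation = form.get("reservation", "").strip()
    if reservation:
        sbatch_args.append(f"--reservation={shlex.quote(reservation)}")

    setup = [f"module load {shlex.quote(m)}" for m in split_modules(form.get("modules", ""))]
    conda_env = form.get("conda_env", "").strip()
    if conda_env:
        setup.append(f"conda activate {shlex.quote(conda_env)}")
    return sbatch_args, setup


if __name__ == "__main__":
    # Example values only; the path and names below are placeholders.
    args, setup = build_launch_plan({
        "memory": "10 GB",
        "working_dir": "/scratch/demo",
        "reservation": "",
        "modules": "apptainerpro, anaconda3",
        "conda_env": "myenv",
    })
    print("sbatch", " ".join(args))
    print("\n".join(setup))
```

Blank optional fields (here, the reservation) simply produce no option, which matches the "default if left blank" behavior described in the help text above.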