@bertiethorpe bertiethorpe commented Sep 12, 2025

  • Upgrades Open OnDemand to v4 (from v3)
  • Fixes the fatimage build not installing Open OnDemand app packages
  • Increases fatimage size to 20GB (from 15GB)

NB: The password for "code server app" sessions can be found using My Interactive Sessions > Session ID > connection.yml
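
As a rough illustration of the note above, the password could also be pulled out of a downloaded copy of connection.yml with a few lines of Python. This is only a sketch: it assumes the file holds flat `key: value` lines including a `password` key, the helper name `read_connection_value` and the sample contents are hypothetical, and it is a simple line scan rather than a full YAML parser.

```python
from typing import Optional


def read_connection_value(text: str, key: str = "password") -> Optional[str]:
    """Return the value of the first `key: value` line matching `key`, else None."""
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines, comments, and anything that is not a key/value pair.
        if not stripped or stripped.startswith("#") or ":" not in stripped:
            continue
        name, _, value = stripped.partition(":")
        if name.strip() == key:
            # Drop surrounding whitespace and optional quotes around the value.
            return value.strip().strip("'\"")
    return None


# Hypothetical file contents, for illustration only (not taken from a real session).
sample = "host: node-0\nport: 8080\npassword: example-secret\n"
print(read_connection_value(sample))
```

For a real session, the text would come from the connection.yml shown under My Interactive Sessions > Session ID.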

Open OnDemand v4 adds a cluster status page:
[2 images]

@bertiethorpe bertiethorpe requested a review from a team as a code owner September 12, 2025 11:19
@sjpb sjpb changed the title Bump OSC's Openondemand v4 & Fix OOD app packages install Bump OpenOndemand to v4 & install apps in fatimage Sep 19, 2025
@sjpb sjpb changed the title Bump OpenOndemand to v4 & install apps in fatimage Bump OpenOnDemand to v4 & install apps in fatimage Sep 19, 2025
@sjpb sjpb changed the title Bump OpenOnDemand to v4 & install apps in fatimage Bump Open OnDemand to v4 & install apps in fatimage Sep 19, 2025
@sjpb sjpb self-requested a review September 19, 2025 09:19

@sjpb sjpb left a comment

It also needs some notes in a comment about what testing has been carried out on Open OnDemand.

@sjpb sjpb self-requested a review September 19, 2025 12:23
sjpb commented Sep 19, 2025

Image builds failing with:

==> openstack.openhpc: TASK [dnf_repos : Install epel-release] ****************************************
==> openstack.openhpc: task path: /home/runner/work/ansible-slurm-appliance/ansible-slurm-appliance/ansible/roles/dnf_repos/tasks/set_repos.yml:22
==> openstack.openhpc: Friday 19 September 2025  13:30:05 +0000 (0:00:05.997)       0:00:54.961 ******
==> openstack.openhpc: fatal: ***: FAILED! => {
==> openstack.openhpc:     "changed": false,
==> openstack.openhpc:     "rc": 1,
==> openstack.openhpc:     "results": []
==> openstack.openhpc: }
==> openstack.openhpc: 
==> openstack.openhpc: MSG:
==> openstack.openhpc: 
==> openstack.openhpc: Failed to download metadata for repo 'OpenHPC': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried

sjpb commented Sep 19, 2025

Looks like dnf_repos_password isn't being set. Run without deleting build VM on failure:
https://github.com/stackhpc/ansible-slurm-appliance/actions/runs/17860113047
[edit: Apparently Leafcloud S3 is struggling]

sjpb commented Sep 25, 2025

OnDemand testing (on RL8):

  • Monitoring: OK
  • Status dashboard: OK
  • Shell: OK (sinfo, module av)
  • Files: OK
  • Remote desktop: OK (in terminal: sinfo, module av)
  • Code server: OK
  • Jupyter notebook: OK
  • RStudio: OK

@sjpb sjpb self-requested a review September 25, 2025 09:28

@sjpb sjpb left a comment

See the previous comment for testing details.

@sjpb sjpb requested a review from wtripp180901 September 25, 2025 09:31
@sjpb sjpb merged commit 4548b9b into main Sep 25, 2025
34 checks passed
@sjpb sjpb deleted the feat/update-osc-ood branch September 25, 2025 11:40