simonLeary42
Contributor

The definition for QueueInfo.qos is "The QoSes associated with this queue", which is a little vague. In Slurm, Partition QOS is a rather niche feature:

A QOS can be attached to a partition. This means the partition will have all the same limits as the QOS. This does not associate jobs with the QOS, nor does it give the job any priority or preemption characteristics of the assigned QOS. Jobs may separately request the same QOS or a different QOS to gain those characteristics. However, the Partition QOS limits will override the job's QOS. If the opposite is desired you may configure the job's QOS with Flags=OverPartQOS which will reverse the order of precedence.
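For illustration only, the precedence rule in that passage can be sketched as below. This is a hypothetical model, not Slurm or ood_core code: the `Qos` struct, the `effective_limits` helper, and the sample limit values are all made up, and it assumes that a limit set on the higher-precedence QOS simply replaces the same limit on the other one.

```ruby
# Hypothetical stand-in for a QOS: a name, a hash of limits, and a list of flags.
Qos = Struct.new(:name, :limits, :flags, keyword_init: true)

# Returns the limits that would apply to a job under the quoted rule.
# Assumption: on a conflict, the higher-precedence QOS replaces the other's value.
def effective_limits(partition_qos:, job_qos:)
  return partition_qos.limits.dup if job_qos.nil?

  if job_qos.flags.include?("OverPartQOS")
    # Job QOS wins on conflicts when it carries Flags=OverPartQOS.
    partition_qos.limits.merge(job_qos.limits)
  else
    # Default: partition QOS limits override the job's QOS.
    job_qos.limits.merge(partition_qos.limits)
  end
end

# Sample values, invented for the example.
part_qos  = Qos.new(name: "part",  limits: { "MaxWall" => "01:00:00" }, flags: [])
long_qos  = Qos.new(name: "long",  limits: { "MaxWall" => "24:00:00" }, flags: [])
over_qos  = Qos.new(name: "over",  limits: { "MaxWall" => "24:00:00" }, flags: ["OverPartQOS"])

default_case  = effective_limits(partition_qos: part_qos, job_qos: long_qos)
override_case = effective_limits(partition_qos: part_qos, job_qos: over_qos)
```

In this model the job's own QOS never appears in the partition's data, which is the point being made here: the partition QOS only contributes limits.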

AccountInfo.qos shows which QOSes the user can use to schedule jobs via that account. JobInfo.qos shows which QOS a job is using. QueueInfo.qos pertains only to the resource limits being enforced by Slurm, and is completely opaque to the user even on the Slurm command line. If OOD is to be a "rosetta stone" and maintain a model of resource management with maximum compatibility, I don't think this should be part of that model.

I can't find any usage of this attribute. Please strike down this PR if I'm just not understanding or if this attribute is important.

Note: the attribute is assumed to be a list, but the Slurm documentation implies that it should be just one QOS. I tried to make Slurm barf with SLURM_CONF=/tmp/slurm.conf.test scontrol show partition foobar, with multiple QOS for the foobar partition in /tmp/slurm.conf.test, but it seemed to ignore my file and just print the partition information from production. I don't currently have a containerized slurmctld that I can break, so I'm not sure whether multiple partition QOS is an error or not.
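To make the list-versus-single-value question concrete, here is a minimal Ruby sketch of pulling the partition QOS out of text shaped like the output of the scontrol command above. It is not the ood_core parser: the `partition_qos` helper and the sample text are hypothetical, and splitting on commas is an assumption that would only matter if Slurm ever reported more than one partition QOS.

```ruby
# Sample text shaped like `scontrol show partition` output; the values are made up.
SAMPLE_OUTPUT = <<~OUT
  PartitionName=foobar
     AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL
     AllocNodes=ALL Default=NO QoS=normal
OUT

# Returns the partition QOS names as an array.
# "N/A" (no partition QOS attached) and a missing field both become [].
def partition_qos(scontrol_output)
  # Require line start or whitespace before "QoS=" so "AllowQos=" is not matched.
  match = scontrol_output.match(/(?:^|\s)QoS=(\S+)/)
  return [] if match.nil? || match[1] == "N/A"

  # Assumption: multiple values, if they can occur at all, are comma-separated.
  match[1].split(",")
end

qos_names = partition_qos(SAMPLE_OUTPUT)
```

With a single partition QOS the result is a one-element array, so a list-typed attribute costs nothing if the value really is always singular.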

@johrstrom
Contributor

I hesitate to break QoS, as we use it upstream in auto_qos:

github.com/OSC/ondemand/blob/4acd0a2a3ac3cb46787f757005255d1890be10f2/apps/dashboard/app/lib/account_cache.rb#L92

but it appears we pull the qos from AccountInfo objects, not the QueueInfo, so maybe it's OK?

@simonLeary42
Contributor Author

Yeah, AccountInfo.qos is important.
