-
Notifications
You must be signed in to change notification settings - Fork 122
return tres_per_task as a dict #384
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi,
thanks for the patch and ideas.
I think I've mainly held back on the these tres_per_*
things, because I wasn't sure if just returning a dict is good enough, or whether I'd want to introduce some other type for it.
Another thing was that most of the info these strings have is already available through dedicated attributes on the job instance, like cpu
, cpus_per_task
, memory
, memory_per_cpu
, memory_per_node
, gres_per_task
, ntasks
, etc.
I think I've wanted people to use these dedicated attributes instead, so I left them out for the moment. Also some of them seem to be really confusing in the slurm api: tres_per_job
just contains GRES stuff (why name it tres_per_job
then?), while tres_per_task
seems to include everything else. And the others, I don't even know :)
Anyway, this small rant aside: there is no drawback of exposing the tres_per_task
thing as an additional option to use, so we can add that :)
23c9d69
to
6b4ae20
Compare
thanks for being ok with adding tres_per_task. that will help us. agree on the rant :) tres's seems all a bit inconsistent with a lot of gres syntax in there still. I've updated the patch to use the to_dict() as you suggested. that is much better. thanks! |
I just tested the I tried it with multiple Jobs, but |
Hi, oh, really? darn. what do you see from this? FWIW about 2/3 of our current running and submitted jobs have whether a job has a TresPerTask is very dependent upon the job submission args. eg. or BTW, Re: future devel: cheers, |
Hey, sorry for the late response. Yeah I goofed up and didn't actually request anything like --cpus-per-task=5, so it wasn't showing up. Everything works now. In regards to adding the other Have a nice weekend. |
the old job API had tres_req_str which we were using when folks request gpus-per-task. eg. requests like this
#SBATCH --nodes=6-24 --ntasks=24 --cpus-per-task=1 --gpus-per-task=1 --mem-per-cpu=8g
I couldn't find anything equivalent to tres_req_str in the new Job API, so I added this code that parses tres_per_task into a dict in a vaguely similar way to how you do gres_per_task. ie.
cpu=1,gres/gpu=1 --> {'cpu': 1, 'gres/gpu': 1}
full disclosure - this is my first attempt at cython code, so is probably sketchy :)
sadly(?) there are a bunch of other tres_per_{job,node,task,socket} things in slurm now, not just this one.
however this patch solves our problem for the moment.
anyway, I thought I'd contrib this back and maybe it'd kickstart a discussion on how/if you want to support these kind of tres_per_* things, and if you just want to pass back a _str, or if a dict is nicer, or ... ?