-
Notifications
You must be signed in to change notification settings - Fork 162
Added important Native job info as Message for Workflows #4831
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
I like this feature, and have no issues with the implementation. I do wonder if it would be helpful to convert the |
|
I am thinking of converting all Will add this in ProjectHelper. |
I wonder if this will be a future extension of ood_core, so that people can add human readable versions for each scheduler separately, and access it through the adapter. Definitely not something we should tackle now, but that is likely how we will get around the scheduler issue if we do it later. |
Bubballoo3
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the approach of assembling those fields together into a single message, and added some notes on how we can make sure to do this clearly and consistently across schedulers.
| end | ||
|
|
||
| # Special case: Showing dependency of cancelled job will confuse users | ||
| if state=="CANCELLED" and reason=="Dependency" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per our style guide, we should use && in place of and both here and on line 34.
| msg = "Current job state is #{state}" | ||
|
|
||
| reason = native.dig(:reason) | ||
| if reason != "None" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mention in the opening comment that
I have not added any human readability for reasons as the string are different for each scheduler.
Are we still able to rely on 'None' and 'Dependency' as being consistent across schedulers? Otherwise this seems error-prone and we should find an agnostic way of determining how we want to assemble this message.
For example another way to be super clear about what is happening (or what strings are missing/'None") is to put every variable inside labelled quotes, so something like
" Job has current state:'JOBSTATE' because of reason:'REASON'. Waiting for dependency:'DEPENDENCY' to be satisfied"
That way we don't care what the strings themselves actually are, or even if they are present, as having a nil value for any of those could be useful for debugging what is actually going on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea of ' ' is nice, then users can google search the code and check for readable reason/state.
Nice catch, in PBS / Torque;
:state => :job_state
:reason => :comment (human readable)
:dependency => :depend
:afterok is standard in all scheduler
|
That all being said I do really like the message content how it is, and would hate to make it uglier with a less elegant composition. So my ideal outcome is that we ensure every conditional is scheduler agnostic and keep your message structure the same. |
I agree, I removed the special case as it will just make the parsing uglier |
|
So this now accounts for slurm and pbspro? I think that covers the vast majority of users, and is probably a good compromise that keeps us from having to mess with ood_core right now. Now that I look at the adapters, I am not even sure if workflows will be supported on some of the other schedulers. We should likely figure that out and include some guardrails. Have you looked into workflows support at all? |
|
Workflow is supported on all the schedulers as I was looking for what scheduler OOD supports - |
|
To be exhaustive we would want to cover all the adapters in https://github.com/OSC/ood_core/tree/master/lib/ood_core/job/adapters. will raise an error when an afterok parameter is passed. Digging directly out of the native is bad practice for this reason, but might be alright given the limited time we have to finish this. So I would say check out each adapter, and if the afterok parameter is accepted and passed to the scheduler, then we should also try and find the native key that this info will be stored under. Long term, we can implement this in ood_core and handle all the conversion there. |
|
Also having conditionals for a lot of different schedulers will get very messy, not to mention the work involved figuring out what the proper keys are, so if you want to defer until @johrstrom can give us some advice that might be a better use of your time. If you do want to take a stab at it it wouldn't be bad though. |

Workflows Epic Link - #4338
State,DependencyandReason(why the process is in the current state) into a single field calledMessageI have not added any human readability for
reasonsas the string are different for each scheduler. This will hinder scalability in future.