Correct Dynamic Queue logic for CLOUD_PROVIDER and Static node combinations #6776
Signed-off-by: Andrew Leonard <[email protected]>
|
It's starting to feel like we need a state machine here... |
smlambert
thanks @andrew-m-leonard !
FYI - @JasonFengJ9 @pshipton
|
If static nodes are available they should be used first. Is this still the expected behaviour for now? Or does it not matter whether dynamic or static ones are used? |
This PR fixes the behaviour. Somehow over the past year that functionality got 'broken', so that only dynamic agents were used if CLOUD_PROVIDER is set. Andrew has restored the behaviour of using idle static nodes first, then spinning up dynamic agents only if no static nodes are idle. |
If BOTH a static node and a dynamic node (i.e. one that hasn't been released after doing work) are available, then it can use either. I think that's reasonable. |
|
Instead of changing the order of the if clause if (!areNodesWithLabelOnline(LABEL)) {}, does a one-line change work? |
Yes, if a Static node is "available", then it will get used first, although if both a "Static" and a "Dynamic" are available, i.e. a previously provisioned Dynamic has just finished some work and is "idle" and has not yet been returned to the pool, then either could be used... |
Agree, and especially as we may configure via the Jenkins plugin how long a particular dynamic agent stays alive once it is spun up... |
The problem previously was that if (!areNodesWithLabelOnline(LABEL)) does not check "available"; it only checks "online". So I've changed the logic to basically:
|
The other problem was that if no Statics are online, the Dynamic logic was ignored and a timeout abort occurred |
|
Also tested on a local Jenkins with no Cloud dynamic agents, and if CLOUD_PROVIDER is specified for any reason it will correctly just use the Statics (previously it would hang) |
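As a reading aid, the behaviour described in this comment could be modelled roughly as below. This is only a sketch in Python, not the pipeline's Groovy code: `choose_agent`, its parameters and the returned strings are all hypothetical names introduced for illustration, and the real change may be structured differently.

```python
from typing import Optional


def choose_agent(idle_static: int, cloud_provider: Optional[str], cloud_has_agents: bool) -> str:
    """Hypothetical model of the queueing decision; not the actual Groovy logic."""
    # An idle ("available") static node is always preferred.
    if idle_static > 0:
        return "static"
    # No idle static node: fall back to a dynamic agent, but only when
    # CLOUD_PROVIDER is set and the Jenkins instance really has cloud agents.
    if cloud_provider and cloud_has_agents:
        return "dynamic"
    # Otherwise queue on the static label and wait, rather than hanging
    # on a dynamic label that can never be provisioned.
    return "wait-for-static"
```

The key difference from the earlier check is that the decision is driven by whether a static node is idle, not merely whether one is online.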
|
re: #6776 (comment) @andrew-m-leonard - if there is not already a clear comment in that piece of code, it might be good to add your clarifying comment from above to guide others looking at that code in the future |
Yep, I'll beef up the comments I added, thanks |
Signed-off-by: Andrew Leonard <[email protected]>
|
In this case I feel like a static will be used first, but meanwhile will the dynamic one still be spun up, since node(Label) has 'ci.agent.dynamic'?
|
Not sure that matters, but I don't think Jenkins provisioning will trigger a Cloud plugin when it is scheduling to a Static... My test above didn't do that either: there were 10 Static nodes available when I triggered the sanity.openjdk, all 3 went on the Static nodes, and no Dynamics were spun up. I then triggered the extended.openjdk, which requested 10 nodes; the first 7 went on the Static nodes, and the remaining 3 then waited 2 mins or so for the Dynamic agents to spin up. So Jenkins did a great job! |
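The arithmetic of that test run can be checked with a small sketch. It assumes the 3 sanity.openjdk jobs were still occupying their Static nodes when extended.openjdk was triggered, which is what the 7/3 split suggests; `allocate` is a hypothetical helper, not part of the pipeline.

```python
from typing import Tuple


def allocate(requested: int, idle_static: int) -> Tuple[int, int]:
    """Return (jobs placed on static nodes, jobs needing dynamic agents)."""
    on_static = min(requested, idle_static)
    return on_static, requested - on_static


idle = 10                      # static nodes available at the start
sanity = allocate(3, idle)     # sanity.openjdk requests 3 nodes
idle -= sanity[0]              # assumption: those 3 nodes stay busy
extended = allocate(10, idle)  # extended.openjdk requests 10 nodes
```

Under that assumption, `sanity` is `(3, 0)` and `extended` is `(7, 3)`, matching the observed split.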
Fixes #6773
Concurrent tests with CLOUD_PROVIDER=azure, with 10 "static" nodes available: