Use leader election for better compatibility with in-place upgrades #492
…)" This reverts commit 2a917ea.
The previous commit reverts 2a917e, which changed the manager's Deployment to use the `Recreate` strategy. That change was made for better compatibility with in-place upgrades (see the commit message of the reverted commit). Instead of changing the Deployment strategy, this commit enables leader election in the manager. Leader election also solves the issues mentioned in commit 2a917e. In addition, leaving the Deployment strategy unchanged works better with odh-operator. Signed-off-by: Edgar Hernández <[email protected]>
I'm not familiar with all the details of kserve, but this looks good to me.
@israel-hdez
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: danielezonca, israel-hdez, spolti, zdtsw The full list of commands accepted by this bot can be found here. The pull request process is described here
There is a PR on upstream.
/lgtm
5bfa8ea into opendatahub-io:release-v0.14
@danielezonca Yes, it is here: kserve#4234
What this PR does / why we need it:
Following up with comments here: #491 (comment)
This reverts commit 2a917ea (PR #491), which changed the manager's Deployment to use the `Recreate` strategy. That change was made for better compatibility with in-place upgrades (see the description of PR #491). Instead of changing the Deployment strategy, this PR enables leader election in the manager. Leader election also solves the issues mentioned in PR #491. In addition, leaving the Deployment strategy unchanged works better with odh-operator.
Which issue(s) this PR fixes
Fixes https://issues.redhat.com/browse/RHOAIENG-18977
Feature/Issue validation/testing:
Testing is similar to that in PR #491. In addition, when the kserve-controller Deployment is duplicated, the logs should show the second deployment waiting until it can acquire the lease.
Special notes for your reviewer:
On the very first upgrade, despite the updated configuration, the new version of the manager will still run in parallel with the old version. This is expected, because the older version does not have leader election enabled. This should be OK, as we do not yet promote InferenceGraphs as supported in ODH. Once this change is released, subsequent ODH upgrades should show the expected behavior: the new version does not fully boot until it acquires the lease and becomes the leader. This is the reason for testing this PR by duplicating the deployment.
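The difference between the first upgrade and later ones can be sketched with a small model. The `manager` type and `activeManagers` function below are hypothetical and exist only to illustrate the reasoning in this note; they assume that a manager without leader election never consults the lease, and that among participating managers only the first to start holds it.

```go
package main

import "fmt"

// manager is a hypothetical model of one controller manager version.
type manager struct {
	name           string
	leaderElection bool // whether this version participates in leader election
}

// activeManagers returns the names of the managers that would be reconciling
// when all the given managers run at once, listed in start order. Managers
// without leader election never consult the lease, so they are always active;
// among those that participate, only the first to start holds the lease.
func activeManagers(running []manager) []string {
	var active []string
	leaseHeld := false
	for _, m := range running {
		switch {
		case !m.leaderElection:
			active = append(active, m.name)
		case !leaseHeld:
			leaseHeld = true
			active = append(active, m.name)
		}
	}
	return active
}

func main() {
	// First upgrade: the old version predates leader election.
	first := []manager{{"old", false}, {"new", true}}
	fmt.Println(activeManagers(first)) // [old new]

	// Later upgrades: both versions participate in the election.
	later := []manager{{"new", true}, {"newer", true}}
	fmt.Println(activeManagers(later)) // [new]
}
```

Under these assumptions, both versions are active during the first upgrade, while on later upgrades the incoming version stays idle until the lease is released.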