-
Notifications
You must be signed in to change notification settings - Fork 48
Survive minor db drops #427
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
38ab4de to
f0bc6ed
Compare
2d3dd9d to
e1a9865
Compare
|
Thanks, @adamruzicka ! IIUC, this will help if a DB won't be available for cca 1 minute, and if more it will just kill the process now? If so, apologies for the newbie question, but looking at the related tickets, how will that help users to recover from the DB drop for e.g. 15 minutes? Is it a clear state that dynflow process is dead and needs to be started, is it that something will automatically try to restart the process since it's dead now? |
Yup, that's about right. The timing isn't exact, but the breaking point between carrying on and crashing hard is somewhere around that 1 minute mark.
In Foreman deployment, systemd should attempt to restart it, once we move to containers I'd expect again systemd or the container runtime to monitor it and restart if needed. In worst case without anything restarting it, this would at least clearly mark it as being down, rather than leaving it "up" in a broken state. |
ofedoren
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the explanation in #427 (comment) and that I'm not much familiar with dynflow internals, I don't have any concerns looking at the code changes themselves.
In other words, LGTM, but how would you like to proceed with the commits before merge?
I'd prefer to get your blessing on #462 , squash and merge that and then rebase this one on top |
7211e39 to
8f1596d
Compare
Essentially it means wrapping all db interactions in the persistence adapter with retries. We don't really need the retries, but we need to wrap the db-specific errors with general dynflow persistence errors which the rest of the codebase expects.
8f1596d to
248ad99
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as soon as it's green
Includes #462
Essentially it means wrapping all db interactions in the persistence adapter with retries. We don't really need the retries, but we need to wrap the db-specific errors with general dynflow persistence errors which the rest of the codebase expects.