Skip to content

Commit 47fcf1b

Browse files
committed
Fix: libcrmservices: Add another place to finalize Start/Stop ops.
The comment should explain this pretty well. What was happening was that an executor regression test was failing because the operation never gets finalized. Essentially, services__finalize_async_op never gets called for a unit that doesn't exist. We typically want this to happen as part of the systemd callback that's registered in the executor, but this callback will never get called in this case because there's no systemd message that will make it through our filter and trigger the callback. JobRemoved won't get sent because there's no job to remove. It seems like we could look for a failure from LoadUnit for units that don't exist, but pacemaker always writes an override unit file which means the unit will always exist.
1 parent cc89994 commit 47fcf1b

File tree

1 file changed

+14
-5
lines changed

1 file changed

+14
-5
lines changed

lib/services/systemd.c

Lines changed: 14 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -901,14 +901,23 @@ unit_method_complete(DBusPendingCall *pending, void *user_data)
901901
CRM_LOG_ASSERT(pending == op->opaque->pending);
902902
services_set_op_pending(op, NULL);
903903

904-
/* Determine result and finalize action. Start actions must be finalized
905-
* by the registered systemd callback (for instance, handle_systemd_job_complete
906-
* in the executor) to allow for some place to react to the finished
907-
* action before finalizing it and cleaning it up.
904+
/* Determine result and finalize action.
905+
*
906+
* Start and stop actions must be finalized by the registerd systemd callback
907+
* (for instance, handle_systemd_job_complete in the executor) to allow for
908+
* some place to react to the finished action before finalizing it and cleaning
909+
* it up.
910+
*
911+
* HOWEVER, if the operation failed with NOT_INSTALLED, we need to finalize it
912+
* right now too. We'll never receive a JobRemoved() signal for that case, so
913+
* the callback will never be called. Additionally since we set up an override
914+
* file, LoadUnit will always succeed so we can't finalize based on that
915+
* failing. We have to do this here.
908916
*/
909917
process_unit_method_reply(reply, op);
910918

911-
if (!pcmk__strcase_any_of(op->action, PCMK_ACTION_START, PCMK_ACTION_STOP, NULL)) {
919+
if (!pcmk__strcase_any_of(op->action, PCMK_ACTION_START, PCMK_ACTION_STOP, NULL) ||
920+
op->status == PCMK_EXEC_NOT_INSTALLED) {
912921
services__finalize_async_op(op);
913922
}
914923

0 commit comments

Comments
 (0)