This repository was archived by the owner on May 21, 2024. It is now read-only.

SecondaryTcpServer hangs when shutting down #1803

@WeekiatAngMotional

Hi,

We run aktualizr as a systemd service. When a secondary ECU is told to shut down, network.service is brought down first, followed by the aktualizr service. When the aktualizr process receives the termination signal, SecondaryTcpServer::stop() is called, which opens a local connection to unblock the accept() call in the run() function:

void SecondaryTcpServer::stop() {
  LOG_DEBUG << "Stopping Secondary TCP server...";
  keep_running_.store(false);
  // unblock accept
  ConnectionSocket("localhost", listen_socket_.port()).connect();
}

Because the network service is already down at that point, the accept() call in SecondaryTcpServer::run() never gets unblocked, and systemd forcefully kills the process after a long timeout:

systemd[1]: Stopped Network Name Resolution.
systemd[1]: network.service: Deactivated successfully.
systemd[1]: Stopped Network Connectivity.
systemd[1]: aktualizr.service: State 'final-sigterm' timed out. Killing.
systemd[1]: aktualizr.service: Killing process 713 (aktualizr-agen) wit.
systemd[1]: aktualizr.service: Failed with result 'timeout'.
systemd[1]: aktualizr.service: Unit process 713 (aktualizr-agen) remain.
systemd[1]: Stopped aktualizr pipeline.
systemd[1]: Stopped target Basic System.
systemd[1]: Stopped target Path Units.

We fixed this on our end by making the server socket non-blocking and polling for connections before calling accept(). Mainly, though, we would like to know whether this stems from how we use aktualizr, whether we are shutting down services in a different order than aktualizr's guidelines expect, or whether there are passing tests run on machines where the network is brought down first.
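For illustration, the workaround described above might look roughly like the sketch below. This is not the actual patch and not aktualizr's code: PollingTcpServer, kPollTimeoutMs, and the accepted_ counter are hypothetical names, and the 200 ms timeout is an arbitrary choice. The idea is that run() waits in poll() with a timeout and re-checks keep_running_ on every wakeup, so stop() no longer has to open a loopback connection to wake the loop.

```cpp
#include <atomic>
#include <cerrno>
#include <chrono>
#include <cstdint>
#include <stdexcept>
#include <thread>

#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical stand-in for SecondaryTcpServer (assumption, not aktualizr code).
class PollingTcpServer {
 public:
  explicit PollingTcpServer(uint16_t port) {
    listen_fd_ = ::socket(AF_INET, SOCK_STREAM, 0);
    if (listen_fd_ < 0) throw std::runtime_error("socket() failed");

    // Non-blocking, so accept() can never stall even if a pending
    // connection disappears between poll() and accept().
    const int flags = ::fcntl(listen_fd_, F_GETFL, 0);
    ::fcntl(listen_fd_, F_SETFL, flags | O_NONBLOCK);

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);  // port 0 lets the kernel pick a free port
    if (::bind(listen_fd_, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0 ||
        ::listen(listen_fd_, 8) < 0) {
      ::close(listen_fd_);
      throw std::runtime_error("bind()/listen() failed");
    }
  }

  ~PollingTcpServer() { ::close(listen_fd_); }
  PollingTcpServer(const PollingTcpServer&) = delete;
  PollingTcpServer& operator=(const PollingTcpServer&) = delete;

  // Accept loop: wakes up at least every kPollTimeoutMs to re-check the flag.
  void run() {
    while (keep_running_.load()) {
      pollfd pfd{};
      pfd.fd = listen_fd_;
      pfd.events = POLLIN;
      const int ready = ::poll(&pfd, 1, kPollTimeoutMs);
      if (ready < 0 && errno != EINTR) break;  // unrecoverable poll error
      if (ready <= 0) continue;                // timeout or EINTR: re-check flag
      const int conn = ::accept(listen_fd_, nullptr, nullptr);
      if (conn >= 0) {
        ++accepted_;    // a real server would handle the connection here
        ::close(conn);
      }
    }
  }

  // No loopback connection needed: run() notices the flag within one timeout.
  void stop() { keep_running_.store(false); }

  int accepted() const { return accepted_.load(); }

 private:
  static constexpr int kPollTimeoutMs = 200;  // arbitrary value for this sketch
  int listen_fd_{-1};
  std::atomic<bool> keep_running_{true};
  std::atomic<int> accepted_{0};
};
```

The trade-off in this sketch is that shutdown latency is bounded by the poll timeout rather than being immediate. A self-pipe or eventfd added to the poll set would wake the loop right away without depending on the network stack, at the cost of a little more code.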

Here's a successful shutdown for reference:

systemd[1]: Stopped Network Name Resolution.
systemd[1]: network.service: Deactivated successfully.
systemd[1]: Stopped Network Connectivity.
systemd[1]: aktualizr.service: Deactivated successfully.
systemd[1]: Stopped aktualizr pipeline.
systemd[1]: Stopped target Basic System.
systemd[1]: Stopped target Path Units.

Thank you!
