Add logging when (de)activating backend clusters #672

Open
wants to merge 9 commits into
base: main

@amybubu amybubu commented Apr 15, 2025

Description

As part of the effort to improve telemetry described in Issue #649, this PR adds logging whenever a backend cluster is activated or deactivated.

Testing

  • built and ran locally
  • confirmed logs appeared when calling (de)activate via curl or when changing the activation status in the UI, e.g.:
2025-04-03T17:09:03.588-0700	INFO	http-worker-87	io.trino.gateway.ha.router.HaGatewayManager	Backend cluster trino-3 activation status changed to active=false (previous status: active=true).
  • confirmed an error is thrown when trying to activate/deactivate a non-existent backend, e.g.:
java.lang.IllegalStateException: No backend found with name: trino-4, could not activate
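
The behavior tested above can be illustrated with a minimal, self-contained sketch. It is not the PR's actual code: the class `ActivationLoggingSketch`, its in-memory map, and the method names are hypothetical stand-ins for the gateway's backend storage, and `java.util.logging` is used only to keep the example dependency-free. The two message formats are copied from the log line and exception shown above; the "could not deactivate" wording is an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.logging.Logger;

// Hypothetical sketch; names and storage are assumptions, not the PR's code.
class ActivationLoggingSketch {
    private static final Logger log = Logger.getLogger(ActivationLoggingSketch.class.getName());

    // In-memory stand-in for the persisted backend table: name -> active flag.
    private final Map<String, Boolean> backends = new HashMap<>();

    void addBackend(String name, boolean active) {
        backends.put(name, active);
    }

    // Looks up the previous status first, so the log line can report both values.
    String setActive(String name, boolean newStatus) {
        Boolean prevStatus = backends.get(name);
        if (prevStatus == null) {
            // "deactivate" wording is assumed; only "activate" appears in the PR.
            throw new IllegalStateException(String.format(
                    "No backend found with name: %s, could not %s",
                    name, newStatus ? "activate" : "deactivate"));
        }
        backends.put(name, newStatus);
        String message = String.format(
                "Backend cluster %s activation status changed to active=%s (previous status: active=%s).",
                name, newStatus, prevStatus);
        log.info(message);
        // Returned only so the sketch is easy to check without capturing log output.
        return message;
    }
}
```

Reading the previous status before writing the new one is what makes the "previous status" part of the message possible, at the cost of one extra lookup per update.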

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Apr 15, 2025
Member

@xkrogen xkrogen left a comment

Thanks @amybubu! Having good logging around this has been very helpful for tracking down the sequence of events during cluster activation/deactivation.

@andythsu
Member

LGTM!

@andythsu
Member

pinging @vishalya

Member

@ebyhr ebyhr left a comment

Please squash commits into one and update the PR & commit title.

The current PR title is misleading. The PR adds not only logging but also a "dao.findFirstByName" call.

    logActivationStatusChange(clusterName, newStatus, prevStatus);
}
catch (Exception e) {
    log.error("Failed to update activation status for cluster %s: %s", clusterName, e.getMessage());

GatewayResource also logs the error message. I don't think we should log a similar message twice.
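
One way to address this kind of review comment is to let the lower layer throw without logging and to log the failure once at the outer layer. The following is only a sketch under stated assumptions: `SingleLogSketch`, `updateStatus`, `handleRequest`, and the list-based `errorLog` are all hypothetical and do not reflect the real `GatewayResource` or `HaGatewayManager` code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of "log once at the boundary"; not the project's code.
class SingleLogSketch {
    // Stand-in for a logger, so the number of logged errors can be counted.
    static final List<String> errorLog = new ArrayList<>();

    // Manager layer (assumed): performs the update and lets failures propagate.
    // It deliberately does not catch-and-log, to avoid a duplicate message.
    static void updateStatus(String clusterName, boolean active) {
        // Assumption for the sketch: "trino-3" is the only known backend.
        if (!clusterName.equals("trino-3")) {
            throw new IllegalStateException(
                    "No backend found with name: " + clusterName
                            + ", could not " + (active ? "activate" : "deactivate"));
        }
    }

    // Resource layer (assumed): the single place where the failure is logged.
    // Returns true on success, false on failure.
    static boolean handleRequest(String clusterName, boolean active) {
        try {
            updateStatus(clusterName, active);
            return true;
        }
        catch (RuntimeException e) {
            errorLog.add(e.getMessage());
            return false;
        }
    }
}
```

With this split, a failed request such as `handleRequest("trino-4", true)` produces exactly one error entry rather than one per layer.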

5 participants