fix: handle_refresh to handle potential errors in OS monitoring data retrieval #475

yordis · 2025-05-19T07:48:25Z

Fixing the following:

Elixir.KeyError: key :system_mem not found in: {:badrpc, {:EXIT, :terminating}}

If you are using the dot syntax, such as map.field, make sure the left-hand side of the dot is a map
    (phoenix_live_dashboard 0.8.6) lib/phoenix/live_dashboard/pages/os_mon_page.ex:39: Phoenix.LiveDashboard.OSMonPage.handle_refresh/1
    (phoenix_live_dashboard 0.8.6) lib/phoenix/live_dashboard/pages/os_mon_page.ex:28: Phoenix.LiveDashboard.OSMonPage.mount/3
    (phoenix_live_view 1.0.10) lib/phoenix_live_view/utils.ex:348: anonymous fn/6 in Phoenix.LiveView.Utils.maybe_call_live_view_mount!/5
    (telemetry 1.3.0) /opt/umbrella/deps/telemetry/src/telemetry.erl:324: :telemetry.span/3
    (phoenix_live_view 1.0.10) lib/phoenix_live_view/channel.ex:1207: Phoenix.LiveView.Channel.verified_mount/8
    (phoenix_live_view 1.0.10) lib/phoenix_live_view/channel.ex:84: Phoenix.LiveView.Channel.handle_info/2
    (stdlib 6.2) gen_server.erl:2345: :gen_server.try_handle_info/3
    (stdlib 6.2) gen_server.erl:2433: :gen_server.handle_msg/6

…retrieval Example: key :system_mem not found in: {:badrpc, {:EXIT, :terminating}}

yordis · 2025-05-19T07:51:04Z

I notice that defp calculate_cpu_data(_), do: nil is a fallback for calculate_cpu_data/1, I am unsure if you prefer that I add fallbacks to the other ones.

I prefer this way so downstream dont worry about it. Please let me know.

yordis · 2025-05-19T08:02:12Z

I think that it got triggered because I had some dashboards opened and I deployed

josevalim · 2025-05-19T08:16:56Z

@yordis thank you for bringing this up. I noticed this code is using the deprecated :rpc module, so I sent a PR to update it to :erpc, which will raise if the rpc request fails: #476

However, you are right that this will happen in case of deployments and we need to discuss if we should fix this or not:

If we don't fix this, you will get exceptions in your logging system
However, if we fix it by rescueing exceptions, we will probably remain watching the page for a node that no longer exists
There are other handle_refresh actions that will have the same issue

So our alternatives are:

Don't fix it. Let it crash, LiveView will reconnect, and probably redirect to the current node
Once Use erpc #476 is merged, the handle_refresh will now crash, so we can add a try/catch in the handle_refresh and have it show an error/flash message (instead of silently failing)
Change the code to check if the node still exists before calling the handle_refresh callback, if it doesn't exist, we redirect

Any preferences?

yordis · 2025-05-19T17:18:06Z

If we don't fix this, you will get exceptions in your logging system

That was my situation 😄 My goal is to keep it silence.

Don't fix it. Let it crash, LiveView will reconnect, and probably redirect to the current node

Personally, as long as it doesn't cause an exception in the error reporting system, I am OK with it and prefer this option.

Change the code to check if the node still exists before calling the handle_refresh callback, if it doesn't exist, we redirect

Wouldn't there be a race condition still? By the time I call handle_refresh it may not be there, no?

josevalim · 2025-05-19T19:22:35Z

Right, that's the part I am not sure. It is an exception and it is unexpected, if we blindly rescued, it would hide actual bugs too.

So even if we rescued it in handle_refresh, I would still log it, but perhaps it would be easier for you to silence this particular log.

yordis · 2025-05-19T19:56:04Z

Navigate away to another node and a flash message? We could rescue very specific exception only.

yordis · 2025-05-22T20:59:10Z

@josevalim should I move forward with the previous message?

josevalim · 2025-05-23T09:44:39Z

I am not sure there is a conclusion here. There isn't a single exceptions for us to rescue and if we rescue everything, we could hide actual exceptions. At best we could capture the refresh exception and call Logger ourselves and then ask to reload the page. That's the same behaviour we have today but, if we are the ones calling Logger, you can then use Logger filters to remove it out from your logs, either at compile-time or runtime.

fix: handle_refresh to handle potential errors in OS monitoring data …

cf69b95

…retrieval Example: key :system_mem not found in: {:badrpc, {:EXIT, :terminating}}

yordis marked this pull request as ready for review May 19, 2025 07:49

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: handle_refresh to handle potential errors in OS monitoring data retrieval #475

fix: handle_refresh to handle potential errors in OS monitoring data retrieval #475

Uh oh!

yordis commented May 19, 2025 •

edited

Loading

Uh oh!

yordis commented May 19, 2025

Uh oh!

yordis commented May 19, 2025

Uh oh!

josevalim commented May 19, 2025 •

edited

Loading

Uh oh!

yordis commented May 19, 2025

Uh oh!

josevalim commented May 19, 2025

Uh oh!

yordis commented May 19, 2025

Uh oh!

yordis commented May 22, 2025

Uh oh!

josevalim commented May 23, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

fix: handle_refresh to handle potential errors in OS monitoring data retrieval #475

Are you sure you want to change the base?

fix: handle_refresh to handle potential errors in OS monitoring data retrieval #475

Uh oh!

Conversation

yordis commented May 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

yordis commented May 19, 2025

Uh oh!

yordis commented May 19, 2025

Uh oh!

josevalim commented May 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

yordis commented May 19, 2025

Uh oh!

josevalim commented May 19, 2025

Uh oh!

yordis commented May 19, 2025

Uh oh!

yordis commented May 22, 2025

Uh oh!

josevalim commented May 23, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

yordis commented May 19, 2025 •

edited

Loading

josevalim commented May 19, 2025 •

edited

Loading