@santigimeno santigimeno commented Sep 29, 2025

Avoid unsigned underflow when idle_diff is slightly larger than loop_diff due to clock skew on idle loops.

Fixes: https://github.com/nodesource/nsolid-private/issues/15
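
For readers unfamiliar with the failure mode, here is a minimal sketch of the wraparound and of the clamp that avoids it. The function names `naive_proc_diff` and `clamped_proc_diff` are hypothetical and are not taken from nsolid_api.cc; only the variable names `loop_diff` and `idle_diff` come from this PR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: subtracts without any guard. With unsigned 64-bit
// operands the result wraps modulo 2^64 whenever idle_diff > loop_diff.
uint64_t naive_proc_diff(uint64_t loop_diff, uint64_t idle_diff) {
  return loop_diff - idle_diff;
}

// Hypothetical helper: clamps idle_diff to loop_diff first, so the
// subtraction can never wrap.
uint64_t clamped_proc_diff(uint64_t loop_diff, uint64_t idle_diff) {
  if (idle_diff > loop_diff) {
    idle_diff = loop_diff;
  }
  assert(idle_diff <= loop_diff);
  return loop_diff - idle_diff;
}
```

With `loop_diff = 1000` and `idle_diff = 1005`, the naive version returns a value just below 2^64, while the clamped version returns 0.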

Summary by CodeRabbit

  • Bug Fixes
    • Clamped event-loop and idle time calculations to prevent unsigned underflow.
    • Corrected idle-vs-loop comparisons to avoid negative diffs producing bogus utilization.
    • Reduced spurious spikes in reported event-loop usage for more reliable monitoring.

@santigimeno santigimeno self-assigned this Sep 29, 2025

coderabbitai bot commented Sep 29, 2025

Walkthrough

Adds defensive clamping in event-loop metric calculations: clamp loop_idle_time to loop_duration in uv_metrics_cb_, and clamp idle_diff to loop_diff in get_event_loop_stats_ to prevent unsigned underflow and incorrect utilization values.

Changes

Cohort / File(s) Summary of changes
Event-loop metric clamping
src/nsolid/nsolid_api.cc
- In uv_metrics_cb_: if loop_idle_time > loop_duration, set loop_idle_time = loop_duration to prevent underflow in later proc time calculations.
- In get_event_loop_stats_: if idle_diff > loop_diff, set idle_diff = loop_diff to avoid unsigned underflow and bogus utilization values.
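
A rough sketch of how the clamp feeds into the utilization value, assuming utilization is the ratio of processing time to loop time. The struct `LoopStats`, the function `compute_loop_stats`, and the zero-duration fallback are illustrative assumptions, not the actual code in get_event_loop_stats_.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type for illustration only.
struct LoopStats {
  double utilization;   // fraction of the interval spent processing
  double idle_percent;  // (1 - utilization) * 100
};

// Hypothetical sketch: clamp idle_diff before deriving proc_diff.
LoopStats compute_loop_stats(uint64_t loop_diff, uint64_t idle_diff) {
  if (idle_diff > loop_diff) {
    idle_diff = loop_diff;  // clock skew: idle cannot exceed the loop time
  }
  assert(idle_diff <= loop_diff);
  const uint64_t proc_diff = loop_diff - idle_diff;
  // Assumption: report zero utilization for an empty interval.
  const double utilization =
      loop_diff == 0 ? 0.0
                     : static_cast<double>(proc_diff) /
                           static_cast<double>(loop_diff);
  return LoopStats{utilization, (1 - utilization) * 100};
}
```

In this sketch a skewed sample such as `compute_loop_stats(1000, 1005)` yields a utilization of 0 and an idle percentage of 100 instead of a huge bogus ratio.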

Sequence Diagram(s)

sequenceDiagram
    participant UV as uv_metrics_cb_
    participant Stats as get_event_loop_stats_
    participant State as loop vars/times

    UV->>State: read loop_idle_time, loop_duration
    alt loop_idle_time > loop_duration
        UV->>State: loop_idle_time = loop_duration
        Note right of State: clamp to prevent unsigned underflow
    end

    Stats->>State: read loop_diff, idle_diff
    alt idle_diff > loop_diff
        Stats->>State: idle_diff = loop_diff
        Note right of State: clamp to avoid bogus utilization
    end
    Stats->>State: compute proc_diff and utilization metrics

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested reviewers

  • juanarbol

Poem

I’m a rabbit in C, quick to mend a small blip,
I clipped idle tails so metrics won't slip.
No underflow tunnels, no numbers gone bad,
A tiny safe hop keeps the stats calm and glad. 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check | ✅ Passed | The pull request title "src: clamp idle_diff before computing proc_diff" is fully related to the main changes in the changeset. The raw summary indicates that clamping logic was added in two locations (uv_metrics_cb_ and get_event_loop_stats_) to prevent idle_diff from exceeding loop_diff, which directly aligns with the title's description. The title is concise, specific, and clearly communicates the primary technical solution being implemented to address unsigned underflow issues in loop timing calculations. A developer scanning the commit history would quickly understand that this change involves adding clamping logic for idle_diff calculations.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 890194b and a4e371e.

📒 Files selected for processing (1)
  • src/nsolid/nsolid_api.cc (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/nsolid/nsolid_api.cc

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.


Comment @coderabbitai help to get the list of available commands and usage tips.


@juanarbol juanarbol left a comment

SGTM

@santigimeno santigimeno requested a review from EHortua October 7, 2025 15:38

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 13eb35c and 5666509.

📒 Files selected for processing (1)
  • src/nsolid/nsolid_api.cc (2 hunks)
🔇 Additional comments (1)
src/nsolid/nsolid_api.cc (1)

2165-2171: LGTM! Clamping prevents unsigned underflow.

The clamping logic correctly handles the clock skew issue where idle_diff can be slightly larger than loop_diff on some platforms. This prevents unsigned underflow in the proc_diff calculation on line 2171 and eliminates the bogus utilization values reported in issue #15.

Comment on lines 2202 to 2219
+ // Check current_hrtime_ >= exit_time to avoid unsigned underflow when
+ // exit_time is slightly larger due to clock skew (observed on Windows).
+ // Without this check, the subtraction can produce bogus values in the
+ // millions of seconds range.
+ fprintf(stderr, "[DEBUG] exit_time=%llu, prev_call_time=%llu, current_hrtime=%llu, rolling_est_lag=%f\n",
+         (unsigned long long)exit_time,
+         (unsigned long long)stor->prev_call_time_,
+         (unsigned long long)stor->current_hrtime_,
+         envinst->rolling_est_lag_);
  stor->loop_estimated_lag =
-   (exit_time > 0 && stor->prev_call_time_ > exit_time) ?
+   (exit_time > 0 && stor->prev_call_time_ > exit_time &&
+    stor->current_hrtime_ >= exit_time) ?
    (stor->current_hrtime_ - exit_time) / 1e6 : envinst->rolling_est_lag_;
+ fprintf(stderr, "[DEBUG] loop_estimated_lag=%f (condition: exit_time>0=%d, prev>exit=%d, current>=exit=%d)\n",
+         stor->loop_estimated_lag,
+         exit_time > 0,
+         stor->prev_call_time_ > exit_time,
+         stor->current_hrtime_ >= exit_time);
⚠️ Potential issue | 🔴 Critical

Remove debug logging before merging.

The additional guard stor->current_hrtime_ >= exit_time correctly prevents unsigned underflow, but the debug fprintf statements on lines 2206-2210 and 2215-2219 must be removed before merging to production:

  • These debug logs will pollute stderr in production environments
  • Performance concern: metrics collection runs frequently (every event loop iteration)
  • Portability: %llu may not be correct for uint64_t on all platforms (consider PRIu64 from <cinttypes>)
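
On the portability point, a small sketch of formatting a `uint64_t` with `PRIu64` from `<cinttypes>`. Note that the snippet under review casts each argument to `unsigned long long`, which keeps `%llu` well-defined; `PRIu64` is an alternative that avoids the cast. The helper name `format_exit_time` is hypothetical.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a 64-bit timestamp without casting.
// PRIu64 expands to the correct conversion specifier for uint64_t
// on the target platform.
std::string format_exit_time(uint64_t exit_time) {
  char buf[64];
  const int n =
      std::snprintf(buf, sizeof(buf), "exit_time=%" PRIu64, exit_time);
  // snprintf returns a negative value on error and the untruncated
  // length otherwise; 64 bytes is enough for any uint64_t here.
  assert(n >= 0 && static_cast<std::size_t>(n) < sizeof(buf));
  return std::string(buf, static_cast<std::size_t>(n));
}
```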

Apply this diff to remove the debug logging:

- fprintf(stderr, "[DEBUG] exit_time=%llu, prev_call_time=%llu, current_hrtime=%llu, rolling_est_lag=%f\n",
-         (unsigned long long)exit_time,
-         (unsigned long long)stor->prev_call_time_,
-         (unsigned long long)stor->current_hrtime_,
-         envinst->rolling_est_lag_);
  stor->loop_estimated_lag =
    (exit_time > 0 && stor->prev_call_time_ > exit_time &&
     stor->current_hrtime_ >= exit_time) ?
    (stor->current_hrtime_ - exit_time) / 1e6 : envinst->rolling_est_lag_;
- fprintf(stderr, "[DEBUG] loop_estimated_lag=%f (condition: exit_time>0=%d, prev>exit=%d, current>=exit=%d)\n",
-         stor->loop_estimated_lag,
-         exit_time > 0,
-         stor->prev_call_time_ > exit_time,
-         stor->current_hrtime_ >= exit_time);
  stor->loop_idle_percent = (1 - stor->loop_utilization) * 100;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Check current_hrtime_ >= exit_time to avoid unsigned underflow when
// exit_time is slightly larger due to clock skew (observed on Windows).
// Without this check, the subtraction can produce bogus values in the
// millions of seconds range.
stor->loop_estimated_lag =
(exit_time > 0 && stor->prev_call_time_ > exit_time &&
stor->current_hrtime_ >= exit_time) ?
(stor->current_hrtime_ - exit_time) / 1e6 : envinst->rolling_est_lag_;
stor->loop_idle_percent = (1 - stor->loop_utilization) * 100;
🤖 Prompt for AI Agents
In src/nsolid/nsolid_api.cc around lines 2202 to 2219, remove the two debug
fprintf blocks (the one printing
exit_time/prev_call_time/current_hrtime/rolling_est_lag and the one printing
loop_estimated_lag and the condition flags) so they don't pollute stderr or
impact performance; keep the added guard (stor->current_hrtime_ >= exit_time)
and the loop_estimated_lag computation unchanged, and if any future logging is
needed use a proper logging macro or PRIu64 via <cinttypes> rather than raw
fprintf with %llu.
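
As a standalone illustration of the guard that should be kept, here is a sketch of the lag estimate as a free function. The name `estimate_lag_ms` and its parameter list are hypothetical; the condition and the division by 1e6 mirror the snippet above, and the millisecond unit assumes the timestamps are in nanoseconds.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical free-function version of the guarded computation.
// Falls back to the rolling estimate whenever the subtraction would
// be meaningless or would wrap.
double estimate_lag_ms(uint64_t exit_time,
                       uint64_t prev_call_time,
                       uint64_t current_hrtime,
                       double rolling_est_lag) {
  const bool usable = exit_time > 0 &&
                      prev_call_time > exit_time &&
                      current_hrtime >= exit_time;
  if (!usable) {
    return rolling_est_lag;
  }
  assert(current_hrtime >= exit_time);  // subtraction cannot wrap here
  // Assumption: nanosecond inputs, so dividing by 1e6 yields milliseconds.
  return static_cast<double>(current_hrtime - exit_time) / 1e6;
}
```

When `current_hrtime` is even one tick behind `exit_time`, the sketch returns the rolling estimate rather than a wrapped difference.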

@santigimeno santigimeno force-pushed the santi/clamp_loop_delay branch from 5666509 to 890194b Compare October 16, 2025 20:46
Clamp idle time values before subtracting from loop duration to
avoid unsigned underflow when clock skew causes idle time to be
slightly larger than total loop time. This manifests on Windows
when the loop is nearly 100% idle.

Applied in two locations:
- uv_metrics_cb_: prevents bogus rolling_est_lag_ values
- get_event_loop_stats_: prevents bogus loop_utilization values

Fixes: nodesource/nsolid-private#15
@santigimeno santigimeno force-pushed the santi/clamp_loop_delay branch from 890194b to a4e371e Compare October 16, 2025 21:28