feat(ethexe-rpc): Metrics middleware for RPC methods#5371

Open
ecol-master wants to merge 94 commits into master from feat/rpc-metrics-middleware

Conversation

@ecol-master
Member

@ecol-master ecol-master commented Apr 21, 2026

Closes: #5387

@gear-tech/dev

Dmitry Kuzmin and others added 30 commits February 6, 2026 13:47
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a comprehensive set of changes primarily focused on enhancing the RPC layer with metrics and significantly refactoring the injected transaction and promise handling. The core intent is to improve the efficiency and observability of RPC communications, particularly for transaction promises, by introducing compact promise structures and a dedicated management system. These changes streamline the data flow between RPC, compute, consensus, and network services, ensuring more efficient promise propagation and better resource utilization.

Highlights

  • RPC Metrics Middleware: Implemented a generic metrics middleware for RPC methods, tracking call counts, latency, and in-flight requests to enhance monitoring and performance analysis.
  • Injected RPC API Overhaul: The injected RPC API was extensively refactored to introduce a more robust and efficient promise handling mechanism, including new modules for promise management, transaction relaying, and asynchronous subscriber spawning.
  • Introduction of Compact Promises: Implemented SignedCompactPromise to optimize network traffic by transmitting only promise signatures, while full promise bodies are managed locally within the RPC service and database.
  • Configurable Promise Emission: The compute service now includes a promises_mode configuration, allowing nodes to control whether promises are always emitted or driven by consensus, providing greater flexibility.
  • Dependency Updates: Updated various Rust dependencies, including the addition of bon for builder patterns, thiserror for error handling, and scopeguard for resource management, alongside several version bumps for existing crates.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a more robust and centralized metrics collection system for the RPC server, replacing manual increments with a middleware-based approach. The new RpcMetricsLayer and RpcMetricsRegistry allow for automated tracking of call counts, latency, and in-flight requests for specified RPC methods. Feedback focuses on preventing potential metric leaks during future cancellation, resolving module naming conflicts, and expanding the default set of tracked methods to ensure comprehensive API monitoring.

Comment thread ethexe/rpc/src/metrics.rs Outdated
Comment thread ethexe/rpc/src/metrics.rs
Comment thread ethexe/rpc/src/metrics.rs Outdated
@ecol-master ecol-master linked an issue Apr 28, 2026 that may be closed by this pull request
@ecol-master ecol-master changed the base branch from kuzmindev/feat/producer-send-promises-hashes to master May 4, 2026 15:45
@ecol-master
Member Author

@claude review

@claude

claude Bot commented May 4, 2026

Claude finished @ecol-master's task in 3m 48s


Code Review

This PR replaces per-method manual counters with a generic Tower middleware layer that intercepts all jsonrpsee method calls and records calls_started_total, calls_finished_total (with status label), call_duration_seconds, and calls_in_flight for configured methods. The design is clean and the integration with jsonrpsee's RpcServiceBuilder is correct.
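
To make the shape of that middleware concrete, here is a minimal std-only sketch of the bookkeeping it performs around each call. It is not the PR's code: `Registry`, `MethodStats`, the `success`/`error` split, and the closure-based `call` are all hypothetical stand-ins for the real Tower layer and the `metrics` crate handles.

```rust
use std::collections::HashMap;
use std::time::Instant;

/// Hypothetical per-method record; field names mirror the metrics listed above.
#[derive(Debug, Default)]
struct MethodStats {
    calls_started_total: u64,
    finished_success: u64, // assumed `status` label value
    finished_error: u64,   // assumed `status` label value
    calls_in_flight: i64,
    call_duration_seconds: Vec<f64>, // stand-in for a histogram
}

/// Hypothetical registry holding stats only for the configured methods.
struct Registry {
    methods: HashMap<&'static str, MethodStats>,
}

impl Registry {
    fn with_methods(names: &[&'static str]) -> Self {
        Self {
            methods: names.iter().map(|n| (*n, MethodStats::default())).collect(),
        }
    }

    /// Wraps one call the way a middleware would: record the start, run the
    /// inner handler, then record the outcome and the elapsed time.
    fn call<T, E>(
        &mut self,
        method: &str,
        inner: impl FnOnce() -> Result<T, E>,
    ) -> Result<T, E> {
        // Methods that are not configured pass straight through.
        let Some(stats) = self.methods.get_mut(method) else {
            return inner();
        };
        stats.calls_started_total += 1;
        stats.calls_in_flight += 1;
        let started_at = Instant::now();

        let result = inner();

        stats.calls_in_flight -= 1;
        stats.call_duration_seconds.push(started_at.elapsed().as_secs_f64());
        match &result {
            Ok(_) => stats.finished_success += 1,
            Err(_) => stats.finished_error += 1,
        }
        result
    }
}
```

The sketch is synchronous, so it cannot show the cancellation problem raised in the first finding; it only illustrates which values are recorded and when.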

high: calls_in_flight gauge leak on future cancellation

File: ethexe/rpc/src/metrics.rs lines 85-92

If the BoxFuture returned from call() is dropped before resolving (server shutdown, connection close), on_outgoing_response never runs and calls_in_flight stays permanently incremented. spawner.rs already demonstrates the correct fix with scopeguard::guard, and scopeguard is already a dependency of ethexe-rpc.

Fix: wrap the decrement in a RAII guard rather than relying on on_outgoing_response:

let future = self.service.call(request);
Box::pin(async move {
    metrics.on_incoming_request();
    // Guard ensures decrement even on cancellation.
    let _guard = scopeguard::guard((), |_| metrics.calls_in_flight.decrement(1));
    let started_at = Instant::now();
    let response = future.await;
    metrics.on_outgoing_response(started_at, &response); // remove calls_in_flight decrement from here
    response
})
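
The cancellation behaviour can be checked in isolation with a std-only sketch. It assumes a hand-written `Drop` guard in place of `scopeguard::guard` and an `AtomicI64` in place of the real gauge; `InFlightGuard`, `NoopWaker`, and `gauge_during_and_after_cancellation` are hypothetical names used only for illustration.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicI64, Ordering};
use std::sync::Arc;
use std::task::{Context, Wake, Waker};

/// Hypothetical stand-in for the `calls_in_flight` guard: increments on
/// creation and decrements on drop, whether or not the call completes.
struct InFlightGuard(Arc<AtomicI64>);

impl InFlightGuard {
    fn enter(gauge: Arc<AtomicI64>) -> Self {
        gauge.fetch_add(1, Ordering::SeqCst);
        Self(gauge)
    }
}

impl Drop for InFlightGuard {
    fn drop(&mut self) {
        self.0.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Waker that does nothing; enough to poll a future by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls a never-finishing call once, then drops it, and returns the gauge
/// value observed while the call was pending and after it was cancelled.
fn gauge_during_and_after_cancellation() -> (i64, i64) {
    let gauge = Arc::new(AtomicI64::new(0));
    let inner = Arc::clone(&gauge);

    let mut call: Pin<Box<dyn Future<Output = ()>>> = Box::pin(async move {
        let _guard = InFlightGuard::enter(inner);
        // Simulates an inner service call that never resolves.
        std::future::pending::<()>().await;
    });

    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    assert!(call.as_mut().poll(&mut cx).is_pending());
    let during = gauge.load(Ordering::SeqCst);

    // Dropping the future models a server shutdown or a closed connection.
    drop(call);
    (during, gauge.load(Ordering::SeqCst))
}
```

Because the guard lives inside the future's state, dropping the future runs the guard's destructor, which is the property the suggested fix relies on.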

medium: inner mod metrics shadows the metrics crate

File: ethexe/rpc/src/metrics.rs line 99

The inner module name conflicts with the metrics crate, requiring #[allow(clippy::module_inception)] and the super::metrics disambiguation on line 25. Renaming to mod types or mod definitions removes the ambiguity without touching anything else.
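
A rough sketch of the suggested layout, assuming the outer module corresponds to `metrics.rs` and that the inner module only holds type definitions. The single `calls_started_total` field and the `default_metrics` helper are placeholders, not the PR's actual definitions.

```rust
// Stand-in for the file-level module `ethexe/rpc/src/metrics.rs`.
mod metrics {
    // The glob re-export keeps public paths unchanged for callers.
    pub use self::types::*;

    // Named `types` so it no longer repeats the parent module's name.
    mod types {
        /// Placeholder definition; the real struct holds metric handles.
        #[derive(Debug, Default, PartialEq)]
        pub struct MethodMetrics {
            pub calls_started_total: u64,
        }
    }
}

/// Callers keep using `metrics::MethodMetrics` after the rename.
fn default_metrics() -> metrics::MethodMetrics {
    metrics::MethodMetrics::default()
}
```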

medium: TRACKED_METHODS missing the read-only injected methods

File: ethexe/rpc/src/metrics.rs lines 35-39

injected_getTransactionPromise and injected_getTransactions are absent. The same rationale for tracking program_calculateReplyForHandle latency applies to getTransactions (batch capped at 100 IDs) and getTransactionPromise.
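
As an illustration only, the extended set could look like the sketch below. The actual contents and type of `TRACKED_METHODS` in the PR are not shown in this review, so the slice type, the presence of `program_calculateReplyForHandle` in it, and the `is_tracked` helper are assumptions.

```rust
/// Assumed shape of the tracked set, with the two read-only injected
/// methods added. The real list in the PR may contain other entries.
const TRACKED_METHODS: &[&str] = &[
    "program_calculateReplyForHandle",
    "injected_getTransactionPromise",
    "injected_getTransactions",
];

/// Hypothetical helper: decides whether the middleware records a call.
fn is_tracked(method: &str) -> bool {
    TRACKED_METHODS.contains(&method)
}
```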

low: inconsistent tracing style in server.rs

File: ethexe/rpc/src/apis/injected/server.rs line 186

trace is imported at line 40, but line 186 uses tracing::trace!. Inconsistent with the relay.rs cleanup in this same PR.


Overall the middleware approach is the right design. The cancellation-safety gap with calls_in_flight is the main thing to address before merging.

@ecol-master
Member Author

@claude review-delta

@claude

claude Bot commented May 4, 2026

Claude finished @ecol-master's task in 1m 56s


Delta Review

Reviewing changes since the previous review. The relevant commit is acc7d48 ("fix: claude review").


Fixed

  • High: calls_in_flight gauge leak — correctly addressed with scopeguard::guard in metrics.rs:88-89. The decrement now runs on drop, so cancellation no longer leaks the gauge.

Still Open

medium: inner mod metrics shadows the metrics crate (metrics.rs:110)

The #[allow(clippy::module_inception)] suppression is still present. The inner module is still named metrics, shadowing the metrics crate name. Renaming to mod types or mod definitions eliminates the suppression and the ambiguity.

medium: TRACKED_METHODS missing read-only injected methods (metrics.rs:35-39)

injected_getTransactionPromise and injected_getTransactions are still absent from the tracked set. Both methods are part of the public injected API surface and are worth monitoring: getTransactions already has a batch-size cap (100 IDs) that makes latency spikes interesting to observe.

low: inconsistent tracing style in server.rs:186

tracing::trace! is used at line 186 while trace is already imported at line 40. All other call sites use the short form.


New observation

The acc7d48 refactor removed on_incoming_request()/on_outgoing_response() and made all MethodMetrics fields pub. Combined with pub use metrics::* at metrics.rs:21, MethodMetrics and its raw Counter/Gauge/Histogram fields are now part of the crate's public API. This is harmless since callers can only call metrics-crate operations on those handles, but if encapsulation matters here, keeping the helper methods and private fields would be the cleaner interface.
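
If encapsulation is wanted, one possible shape is sketched below with std atomics standing in for the `metrics` crate handles. Only the helper names `on_incoming_request` and `on_outgoing_response` come from the earlier revision; the fields, the `success` flag, and the read accessors are hypothetical. The in-flight gauge is left out on purpose, since the guard now owns that decrement.

```rust
mod method_metrics {
    use std::sync::atomic::{AtomicU64, Ordering};
    use std::time::Instant;

    /// Fields stay private, so callers can only go through the helpers.
    #[derive(Default)]
    pub struct MethodMetrics {
        calls_started_total: AtomicU64,
        finished_success: AtomicU64,
        finished_error: AtomicU64,
        total_duration_nanos: AtomicU64, // stand-in for a histogram
    }

    impl MethodMetrics {
        /// Records that a call has started.
        pub fn on_incoming_request(&self) {
            self.calls_started_total.fetch_add(1, Ordering::Relaxed);
        }

        /// Records the outcome and the elapsed time of a finished call.
        pub fn on_outgoing_response(&self, started_at: Instant, success: bool) {
            let nanos = started_at.elapsed().as_nanos() as u64;
            self.total_duration_nanos.fetch_add(nanos, Ordering::Relaxed);
            let counter = if success {
                &self.finished_success
            } else {
                &self.finished_error
            };
            counter.fetch_add(1, Ordering::Relaxed);
        }

        /// Read-only accessors, added here only so the sketch can be checked.
        pub fn started(&self) -> u64 {
            self.calls_started_total.load(Ordering::Relaxed)
        }

        pub fn finished(&self) -> (u64, u64) {
            (
                self.finished_success.load(Ordering::Relaxed),
                self.finished_error.load(Ordering::Relaxed),
            )
        }
    }
}
```

The trade-off is a slightly larger surface inside the module in exchange for being able to change the underlying metric types later without breaking callers.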



Development

Successfully merging this pull request may close these issues.

ethexe-rpc: Implement metrics middleware for RPC

2 participants