-
Notifications
You must be signed in to change notification settings - Fork 580
feat: Add model label for vllm backend metrics #2474
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…tory and DynamoStatLoggerPublisher
WalkthroughIntroduces endpoint label support (notably a “model” label) across runtime, bindings, and examples. Extends WorkerMetricsPublisher::create_endpoint to accept an optional model. Adds Endpoint.add_labels and stored_labels APIs, propagates labels into metrics, and updates VLLM and metrics components to pass model labels during initialization. Changes
Sequence Diagram(s)sequenceDiagram
participant App as App (VLLM/Examples)
participant RT as Runtime Endpoint
participant Pub as WorkerMetricsPublisher
participant Met as Metrics Registry
App->>RT: endpoint("generate")
alt Model provided
App->>RT: add_labels([("model", model)])
end
App->>Pub: create_endpoint(component, model)
Pub->>RT: endpoint("kv_router")
alt Model provided
Pub->>RT: add_labels([("model", model)])
end
Pub->>Met: register metrics with RT.stored_labels()
Met-->>Pub: ok
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
Status, Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Nitpick comments (8)
lib/runtime/src/component.rs (2)
435-436
: Doc nit: fix the chaining exampleThe example suggests calling add_labels directly on a Namespace. It should show the full chain to Endpoint.
Apply this diff to clarify:
- /// This allows for method chaining like: runtime.namespace(...).add_labels(...)? + /// This allows for method chaining like: + /// runtime.namespace(...).component(...).endpoint(...).add_labels(...)?
439-474
: Label validation and builder-style return are solid; consider future cross-hierarchy checks
- Duplicate-key detection within input and refusal to overwrite existing endpoint labels are correct and safe.
- Builder-style
self
consumption enables fluent chaining.Optional: If you plan to support labels at Namespace/Component as well, consider whether conflicts should be validated across the entire hierarchy (Namespace, Component, Endpoint) during metrics registration.
Would you like a follow-up PR sketch that aggregates/validates labels across hierarchy before metric registration?
components/metrics/src/main.rs (1)
140-147
: Correctly attaches model label at endpoint selection timeLabeling the endpoint with ("model", model) before using it ensures that metrics registered downstream observe the label. Error propagation via
?
is appropriate.Consider defining a shared constant (e.g.,
const MODEL_LABEL: &str = "model";
) to avoid typos and make the key reusable across the codebase (VLLM publisher, examples, etc.).lib/runtime/examples/hello_world/src/bin/server.rs (1)
76-77
: Nice showcase of per-endpoint model labelingGood example demonstrating how to tag an endpoint with a model label before starting the handler. Consider centralizing "model" as a constant to keep examples consistent and typo-safe.
lib/runtime/examples/hello_world/src/bin/client.rs (1)
35-36
: Client-side labeling is consistent, but optionalAdding the model label on the client Endpoint is fine for symmetry. In most deployments, only the server-side endpoint labels impact the server’s metrics. Keeping it here for demonstration is OK.
lib/llm/src/mocker/engine.rs (1)
192-195
: Avoid hardcoding the model string in create_endpointPassing Some("mock_model".to_string()) works, but consider moving "mock_model" to a module-level constant or deriving it from test config to keep it consistent across mock components.
lib/runtime/examples/system_metrics/src/lib.rs (1)
97-99
: Endpoint labeling is correct; consider propagating labels to custom metrics tooYou add the model label to the endpoint, but MySystemStatsMetrics::from_endpoint currently calls create_intcounter(..., &[]) so this custom metric may miss the endpoint labels. If Endpoint.create_* does not auto-merge stored labels, consider passing stored_labels() for consistency.
For example (outside the changed hunk):
impl MySystemStatsMetrics { pub fn from_endpoint(endpoint: &dynamo_runtime::component::Endpoint) -> anyhow::Result<Self> { let labels = endpoint.stored_labels(); let data_bytes_processed = endpoint.create_intcounter( "my_custom_bytes_processed_total", "Example of a custom metric. Total number of data bytes processed by system handler", &labels, )?; Ok(Self { data_bytes_processed }) } }lib/llm/src/kv_router/publisher.rs (1)
514-519
: Optionally skip empty/whitespace-only model names to avoid meaningless labels.Guard against accidentally setting an empty label value.
Apply:
- let endpoint = { - let mut e = component.endpoint(KV_METRICS_ENDPOINT); - if let Some(model_name) = model { - e = e.add_labels(&[("model", model_name.as_str())])?; - } - e - }; + let endpoint = { + let mut e = component.endpoint(KV_METRICS_ENDPOINT); + if let Some(model_name) = model.filter(|m| !m.trim().is_empty()) { + e = e.add_labels(&[("model", model_name.as_str())])?; + } + e + };
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (15)
components/backends/vllm/src/dynamo/vllm/main.py
(3 hunks)components/backends/vllm/src/dynamo/vllm/publisher.py
(3 hunks)components/metrics/src/main.rs
(1 hunks)lib/bindings/python/rust/lib.rs
(1 hunks)lib/bindings/python/rust/llm/kv.rs
(1 hunks)lib/llm/src/kv_router/publisher.rs
(2 hunks)lib/llm/src/mocker/engine.rs
(1 hunks)lib/runtime/examples/hello_world/src/bin/client.rs
(1 hunks)lib/runtime/examples/hello_world/src/bin/server.rs
(1 hunks)lib/runtime/examples/service_metrics/src/bin/service_client.rs
(1 hunks)lib/runtime/examples/service_metrics/src/bin/service_server.rs
(1 hunks)lib/runtime/examples/system_metrics/src/lib.rs
(1 hunks)lib/runtime/src/component.rs
(1 hunks)lib/runtime/src/metrics.rs
(1 hunks)lib/runtime/src/pipeline/network/ingress/push_handler.rs
(1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-06-05T01:04:24.775Z
Learnt from: PeaBrane
PR: ai-dynamo/dynamo#1392
File: launch/dynamo-run/src/subprocess/vllm_v1_inc.py:71-71
Timestamp: 2025-06-05T01:04:24.775Z
Learning: The `create_endpoint` method in `WorkerMetricsPublisher` has backward compatibility maintained through pyo3 signature annotation `#[pyo3(signature = (component, dp_rank = None))]`, making the `dp_rank` parameter optional with a default value of `None`.
Applied to files:
lib/llm/src/kv_router/publisher.rs
lib/llm/src/mocker/engine.rs
components/backends/vllm/src/dynamo/vllm/publisher.py
lib/bindings/python/rust/llm/kv.rs
📚 Learning: 2025-08-09T06:10:00.225Z
Learnt from: tzulingk
PR: ai-dynamo/dynamo#2389
File: components/metrics/src/main.rs:63-66
Timestamp: 2025-08-09T06:10:00.225Z
Learning: In components/metrics/src/main.rs, the Args struct's model_name field should be a required String (not Option<String>) because the metrics aggregator must be tied to a specific model when collecting metrics for a component/endpoint.
Applied to files:
lib/llm/src/kv_router/publisher.rs
lib/llm/src/mocker/engine.rs
lib/bindings/python/rust/llm/kv.rs
🧬 Code Graph Analysis (9)
lib/runtime/src/pipeline/network/ingress/push_handler.rs (2)
lib/bindings/python/rust/lib.rs (1)
endpoint
(473-479)lib/runtime/src/component.rs (1)
endpoint
(226-233)
lib/llm/src/kv_router/publisher.rs (5)
lib/bindings/python/rust/llm/kv.rs (1)
create_endpoint
(59-74)lib/bindings/python/src/dynamo/_core.pyi (4)
create_endpoint
(423-427)component
(187-191)Component
(193-210)endpoint
(206-210)lib/bindings/python/rust/lib.rs (2)
component
(558-564)endpoint
(473-479)lib/runtime/src/component.rs (3)
component
(488-490)component
(635-654)endpoint
(226-233)lib/llm/src/local_model.rs (1)
model_name
(87-90)
lib/bindings/python/rust/lib.rs (3)
lib/runtime/src/component.rs (1)
add_labels
(439-474)lib/bindings/python/src/dynamo/_core.pyi (1)
Endpoint
(212-240)lib/bindings/python/rust/llm/entrypoint.rs (1)
to_pyerr
(251-256)
lib/runtime/src/metrics.rs (2)
lib/runtime/src/component.rs (8)
namespace
(218-220)namespace
(657-664)component
(488-490)component
(635-654)endpoint
(226-233)name
(222-224)name
(484-486)name
(670-675)lib/runtime/src/distributed.rs (1)
namespace
(185-187)
components/backends/vllm/src/dynamo/vllm/publisher.py (3)
lib/bindings/python/src/dynamo/_core.pyi (2)
Component
(193-210)WorkerMetricsPublisher
(411-436)lib/llm/src/kv_router/publisher.rs (1)
create_endpoint
(501-530)lib/bindings/python/rust/llm/kv.rs (1)
create_endpoint
(59-74)
lib/bindings/python/rust/llm/kv.rs (2)
lib/bindings/python/src/dynamo/_core.pyi (3)
component
(187-191)create_endpoint
(423-427)Component
(193-210)lib/llm/src/kv_router/publisher.rs (1)
create_endpoint
(501-530)
lib/runtime/examples/service_metrics/src/bin/service_client.rs (3)
lib/runtime/src/component.rs (3)
client
(565-571)component
(488-490)component
(635-654)lib/runtime/src/transports/nats.rs (1)
client
(68-70)lib/runtime/tests/soak.rs (1)
client
(146-209)
components/backends/vllm/src/dynamo/vllm/main.py (4)
lib/bindings/python/rust/lib.rs (3)
component
(558-564)endpoint
(473-479)add_labels
(542-553)lib/runtime/src/component.rs (4)
component
(488-490)component
(635-654)endpoint
(226-233)add_labels
(439-474)lib/bindings/python/src/dynamo/_core.pyi (2)
component
(187-191)endpoint
(206-210)components/backends/vllm/src/dynamo/vllm/publisher.py (1)
StatLoggerFactory
(132-165)
lib/runtime/src/component.rs (1)
lib/bindings/python/rust/lib.rs (2)
v
(680-680)add_labels
(542-553)
🪛 GitHub Actions: Rust pre-merge checks
lib/bindings/python/rust/lib.rs
[error] 536-536: Cargo fmt -- --check failed. Code is not properly formatted in lib/bindings/python/rust/lib.rs. Run 'cargo fmt' to fix formatting.
🪛 GitHub Actions: Pre Merge Validation of (ai-dynamo/dynamo/refs/pull/2474/merge) by tzulingk.
lib/bindings/python/rust/lib.rs
[error] 1-1: Trailing whitespace detected. Changes auto-fixed by the trailing-whitespace hook.
[error] 1-1: Ruff linting failed and auto-fixed issues in this file during pre-commit.
components/backends/vllm/src/dynamo/vllm/publisher.py
[error] 1-1: Ruff linting failed and auto-fixed issues in this file during pre-commit.
🪛 Ruff (0.12.2)
components/backends/vllm/src/dynamo/vllm/publisher.py
4-4: optparse.Option
imported but unused
Remove unused import: optparse.Option
(F401)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Mirror Repository to GitLab
- GitHub Check: Build and Test - dynamo
🔇 Additional comments (12)
lib/bindings/python/rust/lib.rs (1)
539-553
: API addition looks good and preserves event_loop correctlyAccepts List[Tuple[str, str]], converts to (&str, &str) pairs, delegates to inner Endpoint.add_labels, and returns a new Endpoint reusing the same event_loop. This is the right shape for chaining from Python.
lib/runtime/src/component.rs (1)
421-427
: stored_labels: simple, zero-copy view is correctReturning Vec<(&str, &str)> from internal String storage is appropriate for passing to metrics constructors without cloning. Good use of iterators.
lib/runtime/examples/service_metrics/src/bin/service_server.rs (1)
78-79
: Model label at endpoint registration looks correctAdding the model label before endpoint_builder ensures metrics registered for this endpoint will include the label. Nice.
lib/runtime/src/pipeline/network/ingress/push_handler.rs (1)
58-96
: Propagating endpoint labels into all handler metrics is the right fix; watch for label-name clashesUsing endpoint.stored_labels() and passing them to all metric constructors will correctly attach constant labels (e.g., model) at registration time.
One caveat: ensure Endpoint.add_labels prevents or rejects keys that conflict with variable label names of a metric (e.g., "error_type" for errors_total). A collision there would cause registration failures or inconsistent series.
If not already enforced, consider adding validation to Endpoint.add_labels to disallow conflicts with reserved/variable label names for metrics created via this API.
components/backends/vllm/src/dynamo/vllm/main.py (3)
138-143
: Good: label endpoints with model before serving prefill routesAttaching [("model", config.model)] at endpoint construction ensures metrics and routing pick up the model label from the start.
175-180
: Good: label endpoints with model before serving decode routesMirrors the prefill change and aligns with the metrics pipeline expecting labels at registration time.
189-191
: Passing model into StatLoggerFactory addresses first-registration gapProviding config.model to StatLoggerFactory ensures the initial registration of backend metrics includes the model label, preventing duplicate registrations with and without the label.
lib/bindings/python/rust/llm/kv.rs (1)
58-64
: Good: model plumbed through PyO3 with backward-compatible signature. Also pass-through to Rust publisher looks correct.The optional model parameter in the PyO3 signature aligns with the Rust API change and preserves back-compat via a default None. Forwarding the argument in the async block is correct.
Also applies to: 69-70
lib/runtime/src/metrics.rs (1)
1309-1350
: LGTM: test asserts end-to-end label propagation via Endpoint.add_labels + stored_labels.The test validates auto-labels and custom labels in Prometheus output and exercises the exact path we depend on. Solid coverage.
lib/runtime/examples/service_metrics/src/bin/service_client.rs (1)
36-40
: LGTM: model label applied at endpoint before client creation.This ensures metrics created later can inherit the model label. No conflicts with auto-labels.
lib/llm/src/kv_router/publisher.rs (1)
501-521
: LGTM: optional model label added at endpoint construction time.Registering the endpoint with the model label before starting the handler satisfies the PR objective for labeled metrics. Error propagation with ? is correct.
components/backends/vllm/src/dynamo/vllm/publisher.py (1)
135-142
: LGTM: factory plumbs model through; defaults maintain back-compat.The model is stored on the factory and forwarded to the publisher; dp_rank logic remains intact.
Also applies to: 146-149
Co-authored-by: Neelay Shah <[email protected]> Signed-off-by: Tzu-Ling Kan <[email protected]>
…run, manuallly format it.
Co-authored-by: Keiven Chang <[email protected]> Signed-off-by: Hannah Zhang <[email protected]>
Co-authored-by: Keiven Chang <[email protected]>
Co-authored-by: Keiven Chang <[email protected]> Signed-off-by: Krishnan Prashanth <[email protected]>
Overview:
Implement add_labels for Endpoint and add model name to StatLoggerFactory and DynamoStatLoggerPublisher.
Details:
Simply adding the model labels to the Endpoint doesn’t make the metrics include the model label.
This happens because the code registers the same metric multiple times. The first registration doesn’t have the model label, and by the time we add the model label to the Endpoint through add_labels, it’s already too late.
The first time is added through
The second time is added through
So we need to pass the model name into StatLoggerFactory and DynamoStatLoggerPublisher
Where should the reviewer start?
components/backends/vllm/src/dynamo/vllm/publisher.py: Add model name to DynamoStatLoggerPublisher and StatLoggerFactory
lib/runtime/src/component.rs: Implement add_labels for Endpoint.
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
DIS-444 The model label is missing from the backend metrics
Example
Summary by CodeRabbit
New Features
Tests