feat: Add PoC of sglang metrics server (default port 30000) #3407

rmccorm4 · 2025-10-03T17:25:28Z

Overview:

@keivenchang is looking at exposing the full set of SGLang metrics when running through dynamo (python -m dynamo.sglang ...), ideally without having to map and redefine every SGLang metric 1:1 and update that mapping every time a new metric is added.

This exposes the built-in sglang metrics server when running dynamo+sglang.

Details:

Build with these local changes

pushd lib/bindings/python
maturin develop --uv

popd
uv pip install .[sglang]

Run with these changes

python -m dynamo.frontend &

# KEY: Set --enable-metrics
python -m dynamo.sglang --model-path Qwen/Qwen3-0.6B --enable-metrics &

# See server come up in sglang worker log output
# 2025-10-03T17:19:07.084539Z  INFO utils.launch_dummy_health_check_server: Dummy health check server scheduled on existing loop at 127.0.0.1:30000   

# Send inference request
curl localhost:8000/v1/chat/completions -H 'Content-Type: application/json' -d '
{
  "model": "Qwen/Qwen3-0.6B",
  "messages": [{"role": "user", "content": "Write me a DND campaign"}],
  "stream": true,
  "max_tokens": 2,
  "ignore_eos": true
}'

# Check metrics
curl localhost:30000/metrics

NOTE: The metrics server output seems to be empty until a single inference request has been received.
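
The note above can be made scriptable by checking whether the endpoint has returned any sample lines yet. This is a minimal sketch, assuming the endpoint emits the Prometheus text exposition format (where lines starting with `#` are metadata). `has_samples` and `fetch_metrics` are hypothetical helper names for illustration, not part of dynamo or SGLang.

```python
import urllib.request


def has_samples(metrics_text: str) -> bool:
    """Return True if the metrics text contains at least one sample line."""
    for line in metrics_text.splitlines():
        stripped = line.strip()
        # Lines starting with '#' are HELP/TYPE metadata or comments, not samples.
        if stripped and not stripped.startswith("#"):
            return True
    return False


def fetch_metrics(url: str = "http://localhost:30000/metrics", timeout: float = 5.0) -> str:
    """Fetch the raw metrics text from the endpoint used in the steps above."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `has_samples(fetch_metrics())` before and after the curl request would show whether the output is still empty.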

Summary by CodeRabbit

New Features
- Automatically starts a lightweight health check server when metrics are enabled.
- Uses the configured host and port from existing settings for the health check server.

coderabbitai · 2025-10-03T17:29:00Z

Walkthrough

Adds a conditional startup of a dummy health-check server in sglang runtime initialization when metrics are enabled, passing host, port, and enable_metrics from server_args. No other public interfaces or exports are modified.

Changes

Cohort / File(s)	Summary
SGLang runtime health check integration `components/src/dynamo/sglang/main.py`	On init, if `server_args.enable_metrics` is true, calls `launch_dummy_health_check_server(host, port, enable_metrics)` to start a dummy health-check server.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Caller as Runtime.init()
    participant Config as server_args
    participant HC as DummyHealthCheckServer

    Caller->>Config: Read enable_metrics, host, port
    alt enable_metrics == true
        Caller->>HC: launch_dummy_health_check_server(host, port, enable_metrics)
        note right of HC: Health-check server starts
    else
        note over Caller,Config: No health-check server started
    end

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

I twitch my ears at startup’s chime,
A tiny server wakes in time—
If metrics sing, I hop to check,
A heartbeat thumps on port and spec.
When silence falls, I softly stay,
Nose to wind, and bound away.

Pre-merge checks

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Description Check	⚠️ Warning	The pull request description follows the template for Overview and Details but omits the required sections “Where should the reviewer start?” and “Related Issues,” so it does not fully conform to the repository’s description template.	Please add a “Where should the reviewer start?” section that points to the key files changed and include a “Related Issues” section listing any linked issue numbers or action keywords.
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Title Check	✅ Passed	The title succinctly describes the introduction of a proof-of-concept SGLang metrics server and specifies the default port, clearly reflecting the core change in the pull request.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

components/src/dynamo/sglang/main.py (1)
68-71: Add error handling for health check server launch.

The health check server is launched without any error handling. If the server fails to start (e.g., port already in use), it could cause silent failures or unexpected behavior.

Consider wrapping the call in a try-except block:
 if server_args.enable_metrics:
-    launch_dummy_health_check_server(
-        server_args.host, server_args.port, server_args.enable_metrics
-    )
+    try:
+        launch_dummy_health_check_server(
+            server_args.host, server_args.port, server_args.enable_metrics
+        )
+        logging.info(f"Metrics server started on {server_args.host}:{server_args.port}")
+    except Exception as e:
+        logging.warning(f"Failed to start metrics server: {e}")
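
As a complement to the try-except suggestion, the "port already in use" case could also be detected before launching. This is a rough sketch using only the standard library; `is_port_free` is a hypothetical helper, and the check is inherently racy (another process may take the port between the probe and the real launch), so it would not replace error handling around the launch itself.

```python
import socket


def is_port_free(host: str, port: int) -> bool:
    """Return True if a TCP socket can currently bind to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            # bind() raises OSError (e.g. EADDRINUSE) if the port is taken.
            probe.bind((host, port))
        except OSError:
            return False
    return True
```

A caller could log a warning and skip the launch when `is_port_free(server_args.host, server_args.port)` returns False.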

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1faf015 and 2678f28.

📒 Files selected for processing (1)

components/src/dynamo/sglang/main.py (2 hunks)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: trtllm
GitHub Check: vllm
GitHub Check: sglang
GitHub Check: Build and Test - dynamo

🔇 Additional comments (2)

components/src/dynamo/sglang/main.py (2)

12-12: Verify import of launch_dummy_health_check_server
Confirm that launch_dummy_health_check_server is defined in sglang.srt.utils in the SGLang package your project uses and that it accepts (host, port, enable_metrics) as arguments.

68-71: Ensure metrics server is launched for all SGLang engine initializations or confirm scoped exposure
Metrics startup is only in init (main.py:64–71). These functions also create sgl.Engine but omit metrics:

- init_prefill (main.py:160–164)
- init_multimodal_worker (main.py:305–316)
- init_multimodal_prefill_worker (main.py:356–360)

If metrics should cover all engine instances, add the same `if server_args.enable_metrics` block to each; otherwise confirm that only the main decode worker requires metrics.
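
If the block does need to be repeated in each init function, one option is to factor it into a shared helper. The sketch below is an assumption-laden illustration: `maybe_launch_metrics_server` is a hypothetical name, and the launcher is passed in as a parameter (rather than imported from SGLang) so the example stays self-contained. It only assumes the `(host, port, enable_metrics)` call shape shown in this PR.

```python
from typing import Any, Callable


def maybe_launch_metrics_server(server_args: Any, launcher: Callable[..., Any]) -> bool:
    """Call launcher(host, port, enable_metrics) only when metrics are enabled.

    Returns True if the launcher was invoked, False otherwise.
    """
    if not getattr(server_args, "enable_metrics", False):
        return False
    launcher(server_args.host, server_args.port, server_args.enable_metrics)
    return True
```

Each init function could then call `maybe_launch_metrics_server(server_args, launch_dummy_health_check_server)` right after creating its engine.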

coderabbitai · 2025-10-03T17:29:04Z

components/src/dynamo/sglang/main.py

+    if server_args.enable_metrics:
+        launch_dummy_health_check_server(
+            server_args.host, server_args.port, server_args.enable_metrics
+        )


🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

Verify the port parameter matches the intended metrics port.

According to the PR description, the metrics server should run on port 30000 by default. However, this code passes server_args.port, which is likely the main SGLang server port (typically 8000 or similar).

Please confirm whether:

- server_args.port contains the correct metrics port (30000), or
- a separate metrics port configuration should be used (e.g., server_args.metrics_port)

Run the following script to check how server_args.port is configured and if there's a separate metrics port:

🏁 Script executed:

#!/bin/bash
# Description: Check server_args configuration and port usage

# Find where server_args is defined and what ports are available
ast-grep --pattern 'class $_ { $$$ port = $$$ $$$ }'

# Search for metrics_port or similar configuration
rg -nP '(metrics_port|port.*30000|30000)' --type=py -C3

# Check the args parsing to see available port configurations
rg -nP 'def\s+parse_args' --type=py -A 50

Length of output: 47344

Introduce a dedicated --metrics-port CLI option

The metrics server currently binds to server_args.port (the main API port), causing conflicts. Add a --metrics-port flag (default 30000) in ServerArgs.add_cli_args (components/src/dynamo/sglang/args.py) and update the launch_dummy_health_check_server call in components/src/dynamo/sglang/main.py (around lines 68–71) to use server_args.metrics_port.

🤖 Prompt for AI Agents

In components/src/dynamo/sglang/main.py around lines 68 to 71, the metrics server is being started using server_args.port which conflicts with the main API port; update the code to call launch_dummy_health_check_server(server_args.host, server_args.metrics_port, server_args.enable_metrics) and add a new CLI flag in components/src/dynamo/sglang/args.py by introducing a --metrics-port argument on ServerArgs.add_cli_args with a default of 30000 (and parse/store it as metrics_port on ServerArgs); ensure any type parsing matches other port args (int) and update any help text accordingly.
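
For illustration only, the suggested flag could look roughly like the following with plain `argparse`. Note that `--metrics-port` and `metrics_port` do not exist in this PR; they are the reviewer's proposal, and this standalone sketch does not reproduce how `ServerArgs.add_cli_args` is actually structured.

```python
import argparse


def add_metrics_port_arg(parser: argparse.ArgumentParser) -> None:
    """Register a hypothetical --metrics-port flag (proposed, not an existing option)."""
    parser.add_argument(
        "--metrics-port",
        type=int,
        default=30000,
        help="Port for the metrics/health-check server (hypothetical flag).",
    )


def build_parser() -> argparse.ArgumentParser:
    """Build a minimal parser that carries only the proposed flag."""
    parser = argparse.ArgumentParser()
    add_metrics_port_arg(parser)
    return parser
```

With this shape, `build_parser().parse_args(["--metrics-port", "9100"]).metrics_port` would be the integer passed to the launch call in place of `server_args.port`.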

rmccorm4 · 2025-10-03T17:33:01Z

components/src/dynamo/sglang/main.py

    server_args, dynamo_args = config.server_args, config.dynamo_args

    engine = sgl.Engine(server_args=server_args)
+    if server_args.enable_metrics:


Some things to consider either now or later:

- Do we want this health check server up always, even for non-metrics purposes? Then we can remove the `if server_args.enable_metrics:`
- Do we want the metrics server up always / by default?
  - If so, we can default server_args.enable_metrics = True in our worker code
  - If not, we can also consider the worker-specific env vars that toggle metrics today, like DYN_SYSTEM_ENABLED=true
    - However, our current UX proposition is to match sglang as closely as possible on CLI commands for a seamless transition, so toggling sglang engine metrics with a unique dynamo env var here seems like an anti-pattern to me.
- In the future, when we expose metrics via the Rust endpoint, do we still need the dummy health check?

I would use the same flag as sglang, but ideally it would be surfaced via our endpoint and not need a separate server. That depends on what we find from @keivenchang's work.

feat: Add PoC of sglang metrics server (default port 30000)

2678f28

rmccorm4 requested review from a team as code owners October 3, 2025 17:25

pull-request-size bot added the size/XS label Oct 3, 2025

github-actions bot added the feat label Oct 3, 2025

rmccorm4 requested review from ishandhanani, keivenchang and nnshah1 October 3, 2025 17:25

coderabbitai bot reviewed Oct 3, 2025

rmccorm4 commented Oct 3, 2025

nnshah1 approved these changes Oct 3, 2025
