feat: dynamic endpoint registration #3418

ishandhanani · 2025-10-03T22:16:06Z

Overview

Introduced dynamic HTTP endpoints and a Native API handler to enhance the component's functionality and API flexibility.

Changes Made

Added NativeApiHandler class for managing SGLang native API endpoints.
Integrated NativeApiHandler initialization and task management into the main component setup.
Extended the serve_endpoint function to support dynamic http_endpoint_path configuration.
Introduced a new module, dynamic_endpoint.rs, for handling dynamically configured HTTP endpoints.
Updated Instance struct to include HTTP endpoint paths.

Examples

Curling the new dynamic get_model_info

❯ curl -X POST http://localhost:8000/get_model_info
{"responses":[{"data":{"data":[{"model_path":"Qwen/Qwen3-0.6B","tokenizer_path":"Qwen/Qwen3-0.6B","preferred_sampling_params":null,"weight_version":"default"}]}}]}

Curling a non-existant endpoint (404)

❯ curl -s -w "%{http_code}\n" -X POST http://localhost:8000/graham_rust_legend
{"message":"Endpoint not found"}404

Follow up

Implement more endpoints and handle p/d specific implementations as well
Add docs for this

copy-pr-bot · 2025-10-03T22:16:10Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai · 2025-10-03T22:20:29Z

Walkthrough

Adds a new "test" endpoint in sglang main to run alongside the primary endpoint, registers a custom route "/test", and serves it concurrently. Introduces Endpoint.register_custom_endpoint in Rust/Python bindings with path validation and optional etcd registration. Updates two imports for chat_templates to a new module path.

Changes

Cohort / File(s)	Change summary
Test endpoint orchestration `components/src/dynamo/sglang/main.py`	Creates a secondary endpoint named "test", registers the custom route "/test", and serves it concurrently with the main endpoint using the same handler and readiness behavior.
Import path update for chat templates `components/src/dynamo/sglang/request_handlers/multimodal_encode_worker_handler.py`, `components/src/dynamo/sglang/utils/multimodal_chat_processor.py`	Switches imports from `sglang.srt.conversation` to `sglang.srt.parser.conversation`; no logic changes.
Endpoint API: custom endpoint registration (Rust + Python bindings) `lib/bindings/python/rust/lib.rs`, `lib/bindings/python/src/dynamo/_core.pyi`	Adds `Endpoint::register_custom_endpoint(endpoint_path)` with path validation and optional etcd key creation; exposes it to Python in the type stubs.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant App as sglang main.py
    participant EP as Endpoint (Python)
    participant Rust as Endpoint (Rust)
    participant Etcd as etcd (optional)
    participant Srv as Server

    App->>EP: component.endpoint("default")
    App->>EP: component.endpoint("test")
    note right of App: Startup

    App->>EP: test_endpoint.register_custom_endpoint("/test")
    EP->>Rust: register_custom_endpoint("/test")
    alt Valid path
        Rust->>Etcd: Put key for "/test" (if client available)
        Etcd-->>Rust: Ack
        Rust-->>EP: Ok
    else Invalid path
        Rust-->>EP: Error (PyValueError)
    end

    par Serve endpoints
        App->>Srv: serve_endpoint(default, handler.generate)
        App->>Srv: serve_endpoint(test, handler.generate)
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

I hopped to the edge of the routing glade,
A "/test" trail blazed where none was laid.
Etcd whispered keys in the breeze,
Endpoints twinned like mirrored trees.
With tidy imports trimmed just right—
I thump approval, ears alight. 🐇✨

Pre-merge checks

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Description Check	⚠️ Warning	The pull request description provides helpful background and examples but does not adhere to the repository’s required template structure or heading levels. It uses “## Overview” and “## Changes Made” instead of the expected “#### Overview:” and “#### Details:”, and it omits the “#### Where should the reviewer start?” section. The “Related Issues” section is also not formatted with the proper heading and action keyword as specified in the template.	Please update the description to match the template by using the exact section headings (“#### Overview:”, “#### Details:”, “#### Where should the reviewer start?”, “#### Related Issues:”) and include the linked issue with an appropriate action keyword (e.g., “Closes #3275”).

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title Check	✅ Passed	The title “feat: dynamic endpoint registration” accurately captures the primary change of adding support for registering custom endpoints dynamically and is concise and specific to the main feature introduced in the changeset.
Docstring Coverage	✅ Passed	Docstring coverage is 80.00% which is sufficient. The required threshold is 80.00%.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 3

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c48f49a and b317939.

📒 Files selected for processing (5)

components/src/dynamo/sglang/main.py (2 hunks)
components/src/dynamo/sglang/request_handlers/multimodal_encode_worker_handler.py (1 hunks)
components/src/dynamo/sglang/utils/multimodal_chat_processor.py (1 hunks)
lib/bindings/python/rust/lib.rs (1 hunks)
lib/bindings/python/src/dynamo/_core.pyi (1 hunks)

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-08-21T17:23:02.836Z

Learnt from: michaelfeil
PR: ai-dynamo/dynamo#2591
File: lib/bindings/python/rust/http.rs:0-0
Timestamp: 2025-08-21T17:23:02.836Z
Learning: In lib/bindings/python/rust/http.rs, the enable_endpoint method uses EndpointType::all() to dynamically support all available endpoint types with case-insensitive matching, which is more maintainable than hardcoded match statements for endpoint type mappings.

Applied to files:

lib/bindings/python/rust/lib.rs

🧬 Code graph analysis (3)

lib/bindings/python/src/dynamo/_core.pyi (1)

lib/bindings/python/rust/lib.rs (1)

register_custom_endpoint (648-678)

components/src/dynamo/sglang/main.py (2)

lib/bindings/python/rust/lib.rs (4)

component (763-769)

endpoint (628-634)

register_custom_endpoint (648-678)

serve_endpoint (681-734)

lib/bindings/python/src/dynamo/_core.pyi (4)

component (86-90)

endpoint (105-109)

register_custom_endpoint (118-122)

serve_endpoint (124-136)

lib/bindings/python/rust/lib.rs (4)

lib/bindings/python/src/dynamo/_core.pyi (1)

register_custom_endpoint (118-122)

lib/runtime/src/component.rs (3)

drt (177-179)

drt (397-399)

drt (548-550)

lib/runtime/src/distributed.rs (1)

etcd_client (269-271)

lib/bindings/python/rust/llm/entrypoint.rs (1)

to_pyerr (286-291)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)

GitHub Check: Build and Test - dynamo
GitHub Check: tests (.)
GitHub Check: tests (launch/dynamo-run)
GitHub Check: clippy (.)
GitHub Check: tests (lib/bindings/python)
GitHub Check: clippy (lib/bindings/python)
GitHub Check: tests (lib/runtime/examples)
GitHub Check: clippy (launch/dynamo-run)

🔇 Additional comments (3)

components/src/dynamo/sglang/request_handlers/multimodal_encode_worker_handler.py (1)

8-8: LGTM: Consistent import path update.

This import path change is consistent with the update in multimodal_chat_processor.py (line 6), suggesting a coordinated refactoring of the sglang module structure.

components/src/dynamo/sglang/utils/multimodal_chat_processor.py (1)

6-6: Import path change validated: no remaining imports of sglang.srt.conversation, and the new path aligns with existing usage.

lib/bindings/python/rust/lib.rs (1)

647-678: Document custom etcd key usage and tighten path validation

Clarify why raw endpoint_path (e.g. /foo) is written directly to etcd instead of the dyn:// scheme (is this only for HTTP routing/discovery?).

Strengthen validation: reject a lone /, consecutive slashes (e.g. //foo), and disallow whitespace or control characters.

Consider logging or warning when etcd_client is None to indicate registration was skipped.

Wrap the kv_create error to include the endpoint_path in the Python exception for better debugging.

components/src/dynamo/sglang/main.py

lib/bindings/python/src/dynamo/_core.pyi

Aphoh

Quick refactor and it should look cleaner. Didn't deeply check the functionality.

lib/llm/src/http/service/dynamic_endpoint.rs

ishandhanani · 2025-10-06T20:26:31Z

/ok to test 4502810

lib/llm/src/http/service/dynamic_endpoint.rs

ishandhanani · 2025-10-07T16:14:56Z

Rust checks will stop being mad after #3458 goes into main

Aphoh

Few small nitpicks

lib/llm/src/http/service/service_v2.rs

Aphoh

Almost there: change a panic to a Result::Err and remove an extra dependency

lib/llm/src/http/service/dynamic_endpoint.rs

lib/llm/Cargo.toml

rmccorm4 · 2025-10-07T19:53:00Z

lib/llm/src/http/service/dynamic_endpoint.rs

+/// queries each one, and returns `{"responses": [instance1_result, instance2_result, ...]}`.
+///
+/// Returns 404 if no instances have registered the endpoint.
+async fn inner_dynamic_endpoint_handler(


A general design question - correct me if I'm understanding the changes properly or not

Currently, this change more or less lets you call an arbitrary route curl -X POST localhost:8000/my_custom_route.

If the route is something the exists from another handler (ex: /health, /v1/models, etc.), presumably that specific handler will be invoked instead? Or will it also match this one?

If the route doesn't match any of the other handlers, it will hit this one. At that point in time, per request, we query ETCD to see if any endpoint instances have set an http_endpoint_path key, and if so we will try to route the request to all of those matched instances round robin

This is my understanding of the current changes.

Assuming it's roughly accurate - my next question is why not use something like the watcher pattern we have elsewhere for discovery, that will be watching in background for specific keys (like /http_endpoint_path), and maybe dynamically adds/removes routes to the http server and associates corresponding endpoints/instances in some map for them there, rather than on a per request basis here?

I'm a little hesitant about the additional per-request checking here rather than something more discovery-oriented.

This is not a blocking comment yet, just looking to understand the approach here better, and get more context on use case, etc.

Agreeing with Ryan here. I would rather attach a route for the endpoint when it appears, instead of attaching a wildcard route and hitting etcd on every request.

If the route is something the exists from another handler (ex: /health, /v1/models, etc.), presumably that specific handler will be invoked instead? Or will it also match this one?

I believe it will go to the specific handler.

my next question is why not use something like the watcher pattern we have elsewhere for discovery
associates corresponding endpoints/instances in some map for them there, rather than on a per request basis here?

This is actually how I implemented it before. But I feel like it made things a little bit messy. Why have a separate map when it can all just be under a single endpoint entry? If the endpoint goes down, this will also take the etcd entry down which will also take the http_endpoint section as well.

hitting etcd on every request.

Endpoints that are implemented in this fashion are not meant to be endpoints where we serve heavy traffic by any means. Checking etcd here doesn't seem like it costs much. Why have another watcher?

Endpoints that are implemented in this fashion are not meant to be endpoints where we serve heavy traffic by any means. Checking etcd here doesn't seem like it costs much. Why have another watcher?

I think right now it's a "hey you can expose any endpoint here" feature - so I won't be surprised at all if someone tries calling some native endpoint for some other use case that maybe dynamo doesn't natively support yet but the framework does as a stopgap solution until we do support it. And if so, then we lose this assumption right?

I think I'm less worried optimiznig for the extreme heavy load case on one of these custom endpoints (in terms of expecting it to happen) and moreso just general code smell of doing unnecessary work and checking something on every request if we can instead only act to do the bare minimum when necessary (on discovery). At the end of the day we have limited resources (threads, CPUs, etc.) and the less we use them, the more resources the heavy load endpoints (chat, completions, nats, etcd, etc.) have to work freely with and less we have to worry about later.

For example, if any of these native endpoints are things that may not get heavy load, but may get polled say every second or every few seconds, that could be non trivial at some point. Though this implementation is completely custom support for anything, so I can't really guess what all it would be used for.

This is actually how I implemented it before. But I feel like it made things a little bit messy.

Do you have a draft/commit to refer to the original solution?

Not complete but check out b317939

Spoke to Ryan offline. I think my approach before stemmed from a lack of understanding the watcher pattern and how our ModelWatcher worked.

Refactored in latest commit

lib/llm/src/http/service/dynamic_endpoint.rs

no errors for now

b317939

ishandhanani requested review from a team as code owners October 3, 2025 22:16

pull-request-size bot added the size/M label Oct 3, 2025

github-actions bot added the feat label Oct 3, 2025

ishandhanani marked this pull request as draft October 3, 2025 22:16

coderabbitai bot reviewed Oct 3, 2025

View reviewed changes

components/src/dynamo/sglang/main.py Outdated Show resolved Hide resolved

components/src/dynamo/sglang/main.py Outdated Show resolved Hide resolved

lib/bindings/python/src/dynamo/_core.pyi Outdated Show resolved Hide resolved

bump

6d25ddc

pull-request-size bot added size/L and removed size/M labels Oct 3, 2025

ishandhanani added 2 commits October 3, 2025 22:39

Merge branch 'main' into ishan/late-registration-ye

f90eb88

refactor

3b492f4

pull-request-size bot added size/M and removed size/L labels Oct 3, 2025

found

2124d91

pull-request-size bot added size/L and removed size/M labels Oct 3, 2025

nice

ac9357e

ai-dynamo deleted a comment from coderabbitai bot Oct 4, 2025

err

999c716

Aphoh requested changes Oct 4, 2025

View reviewed changes

lib/llm/src/http/service/dynamic_endpoint.rs Outdated Show resolved Hide resolved

ishandhanani added 6 commits October 5, 2025 00:03

cleanup

889f0c5

bump

9562483

go

48bc524

Merge branch 'main' into ishan/late-registration-ye

ad934a4

working

c2d10ee

go

f3d74b8

copy-pr-bot bot temporarily deployed to GITLAB October 6, 2025 20:26 Inactive

copy-pr-bot bot temporarily deployed to GITLAB October 6, 2025 20:31 Inactive

ishandhanani enabled auto-merge (squash) October 6, 2025 21:44

rmccorm4 reviewed Oct 7, 2025

View reviewed changes

lib/llm/src/http/service/dynamic_endpoint.rs Show resolved Hide resolved

Aphoh mentioned this pull request Oct 7, 2025

feat: add /dump_config to the frontend #3455

Draft

4 tasks

ishandhanani added 2 commits October 7, 2025 15:27

Merge branch 'main' into ishan/late-registration-ye

18f590d

bump

0cb978a

Aphoh requested changes Oct 7, 2025

View reviewed changes

lib/llm/src/http/service/service_v2.rs Outdated Show resolved Hide resolved

ishandhanani added 3 commits October 7, 2025 17:02

Merge branch 'main' into ishan/late-registration-ye

145fb06

swap etcd with drt in http builder

95e3a4f

gp

cdc4da0

Aphoh requested changes Oct 7, 2025

View reviewed changes

lib/llm/src/http/service/dynamic_endpoint.rs Outdated Show resolved Hide resolved

lib/llm/Cargo.toml Outdated Show resolved Hide resolved

grahamking self-requested a review October 7, 2025 18:24

ishandhanani added 4 commits October 7, 2025 19:10

lel

3d79298

test

7ffe3e2

no panicgit add .!

28e268d

bruh

7852920

rmccorm4 reviewed Oct 7, 2025

View reviewed changes

PeaBrane reviewed Oct 7, 2025

View reviewed changes

lib/llm/src/http/service/dynamic_endpoint.rs Show resolved Hide resolved

PeaBrane reviewed Oct 7, 2025

View reviewed changes

lib/llm/src/http/service/dynamic_endpoint.rs Outdated Show resolved Hide resolved

ishandhanani added 2 commits October 7, 2025 20:58

bump

a3e2839

simple

11e8439

ishandhanani requested a review from rmccorm4 October 7, 2025 21:44

full refactor

2b7d61a

pull-request-size bot added size/XL and removed size/L labels Oct 8, 2025

ishandhanani added 2 commits October 8, 2025 06:26

try

6f194ee

handling

c6c39cf

feat: dynamic endpoint registration #3418

Are you sure you want to change the base?

feat: dynamic endpoint registration #3418

Conversation

ishandhanani commented Oct 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Overview

Changes Made

Examples

Follow up

Uh oh!

copy-pr-bot bot commented Oct 3, 2025

Uh oh!

coderabbitai bot commented Oct 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

Pre-merge checks

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Aphoh left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ishandhanani commented Oct 6, 2025

Uh oh!

Uh oh!

ishandhanani commented Oct 7, 2025

Uh oh!

Aphoh left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Aphoh left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

rmccorm4 Oct 7, 2025

Choose a reason for hiding this comment

Uh oh!

rmccorm4 Oct 7, 2025

Choose a reason for hiding this comment

Uh oh!

grahamking Oct 7, 2025

Choose a reason for hiding this comment

Uh oh!

ishandhanani Oct 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rmccorm4 Oct 7, 2025

Choose a reason for hiding this comment

Uh oh!

ishandhanani Oct 7, 2025

Choose a reason for hiding this comment

Uh oh!

ishandhanani Oct 8, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ishandhanani commented Oct 3, 2025 •

edited

Loading

coderabbitai bot commented Oct 3, 2025 •

edited

Loading

ishandhanani Oct 7, 2025 •

edited

Loading