feat: allow user to adjust kv_transfer_config #2517

alec-flowers · 2025-08-19T05:22:30Z

Overview:

Previously, we had under the hood handled overriding kv_transfer_config in vLLM. The main reason is it is a json argument and is quite verbose and at the time I had imagined we wanted to always enable NIXL.

With lmcache and kvbm being added in we are seeing more connectors. There are also multiple people who don't necessarily have nixl installed and want a way to turn off the nixl connector.

This refactor addresses this by allowing the user more control over --kv-transfer-config.

Details:

User can specify --kv-transfer-config and this will override everything else. This gives the most flexibility.
As a helper flag the user can also specify --connector and we will set up the connectors for them. The args we support art --connector [nixl, kvbm, lmcache] with the default being the nixl connector. This is to keep the old behavior impact. If you want to have no connector you can specify --connector none or --connector null or --connector and just leave it blank.

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

closes GitHub issue: #xxx

Summary by CodeRabbit

New Features
- Added a --connector option to select KV transfer connectors (supports multiple connectors).
- Automatic side-channel port allocation when a connector requires it; no manual port setup needed.
Chores
- Launcher scripts now explicitly set connectors (defaulting to none unless LMCache is requested).
Refactor
- Simplified configuration by deriving KV transfer setup from selected connectors.
Style
- Reduced log verbosity: certain messages moved from INFO to DEBUG for cleaner runtime logs.

alec-flowers · 2025-08-20T05:23:44Z

@ZichengMa What do you think of this? The user has to do more to set up lmcache correctly (using the flags), however it gives them also more control and flexibility.

coderabbitai · 2025-08-20T05:24:00Z

Caution

Review failed

Failed to post review comments.

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 1945f59 and c6eda51.

📒 Files selected for processing (5)

components/backends/vllm/launch/agg.sh (1 hunks)
components/backends/vllm/launch/agg_lmcache.sh (1 hunks)
components/backends/vllm/launch/agg_router.sh (1 hunks)
components/backends/vllm/src/dynamo/vllm/args.py (7 hunks)
components/backends/vllm/src/dynamo/vllm/main.py (2 hunks)

🧰 Additional context used

🧬 Code Graph Analysis (2)

components/backends/vllm/src/dynamo/vllm/main.py (1)

tests/conftest.py (1)

logger (124-135)

components/backends/vllm/src/dynamo/vllm/args.py (1)

components/backends/vllm/src/dynamo/vllm/ports.py (4)

PortMetadata (57-74)

PortAllocationRequest (78-85)

allocate_and_reserve_port_block (142-226)

get_host_ip (263-290)

🔇 Additional comments (5)

components/backends/vllm/launch/agg_lmcache.sh (1)

15-15: LMCache launcher flag is aligned with connector model

Explicit --connector lmcache + ENABLE_LMCACHE=1 is consistent with the new connector flow and will ensure LMCache defaults are applied.

components/backends/vllm/launch/agg_router.sh (1)

12-12: Router launchers correctly opt out of connectors; confirm prefix-caching expectations

Using --connector none for both workers matches the new semantics. As with agg.sh, without explicitly disabling prefix caching, this will still enable KV events and allocate a ZMQ port by default. Confirm that’s intended for router mode; otherwise add --no-enable-prefix-caching.

Also applies to: 14-14

components/backends/vllm/src/dynamo/vllm/args.py (3)

213-255: NIXL side-channel port allocation logic LGTM

The dp-rank/TP-size aware block allocation, offset math, and negative base-port guard look solid. Using ETCD-backed reservation with block allocation avoids TOCTOU issues.

284-297: Connector → KVTransferConfig mapping looks correct

Single and multi-connector cases map to expected connector classes and include the module path for DynamoConnector. Returning None when a user supplies --kv-transfer-config is the right deferral.

Also applies to: 299-321

362-370: All set_side_channel_host_and_port calls now use the single-port signature

Verified that:

In components/backends/vllm/src/dynamo/vllm/args.py (line 254), the call passes only base_side_channel_port.

The only other call in examples/multimodal/utils/args.py invokes a separate, locally defined function—its signature still accepts (config, hostname=None).

No remaining calls with two or more arguments.

Walkthrough

Adds a connector-based KV-transfer configuration to vLLM: new CLI flag --connector, validation, and dynamic side-channel port allocation only when needed (e.g., NIXL). Updates launcher scripts to pass --connector values (none/lmcache). Adjusts logging levels in main.py. Removes side_channel_port from Config and introduces connector_list.

Changes

Cohort / File(s)	Summary
Launch scripts: connector flag updates `components/backends/vllm/launch/agg.sh`, `components/backends/vllm/launch/agg_router.sh`, `components/backends/vllm/launch/agg_lmcache.sh`	Replace/remove legacy flags and add explicit connector selection: use `--connector none` in agg.sh and agg_router.sh; use `--connector lmcache` in agg_lmcache.sh. No other flow changes.
vLLM args: connector-based KV transfer and ports `components/backends/vllm/src/dynamo/vllm/args.py`	Introduces VALID_CONNECTORS, adds Config.connector_list, removes Config.side_channel_port, adds create_kv_transfer_config, updates CLI parsing for --connector, adjusts port allocation only when connectors require NIXL, updates set_side_channel_host_and_port signature, integrates connector-driven KVTransferConfig creation.
vLLM main: log level tweaks `components/backends/vllm/src/dynamo/vllm/main.py`	Downgrades two INFO logs to DEBUG (signal handler setup and LMCache disabled notice). No functional changes.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant User
  participant Launcher as Shell launcher (agg*.sh)
  participant CLI as dynamo.vllm (CLI)
  participant Args as Args parser
  participant Ports as Port allocator
  participant Engine as vLLM Engine

  User->>Launcher: run agg*.sh (--connector X)
  Launcher->>CLI: invoke dynamo.vllm with flags
  CLI->>Args: parse args (--connector, --kv-transfer-config, ...)
  Args->>Args: validate connectors (none/lmcache/nixl/kvbm)
  alt user provided --kv-transfer-config
    Args-->>Engine: use provided KVTransferConfig
  else connector-based config
    Args->>Args: create_kv_transfer_config()
    alt connectors include "nixl"
      Args->>Ports: allocate side-channel ports (per DP rank)
      Ports-->>Args: base_side_channel_port
      Args->>Args: set_side_channel_host_and_port(port)
    end
    Args-->>Engine: KVTransferConfig (LMCache/Nixl/KVBM/Multi)
  end
  Engine-->>User: start model service

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

feat: LMCache integration in newest ux #2079 — Prior LMCache/KV-transfer config work in vLLM; this PR supersedes/env-refactors to a connector-based API and related logging/args handling.

Poem

I twitch my whiskers at flags newly shown,
Connectors aligned, each path clearly known.
No side-channel burrow unless NIXL’s in sight,
LMCache carrots cached, nibble-byte by byte.
Quiet logs whisper while engines awake—
Hop, hop, deploy; clean trails we make. 🥕🐇

Tip

🔌 Remote MCP (Model Context Protocol) integration is now available!

Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

Status, Documentation and Community

Visit our Status Page to check the current availability of CodeRabbit.
Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

ZichengMa · 2025-08-20T16:41:06Z

I think this makes sense to me. But we may need to specify this in our documents.

components/backends/vllm/src/dynamo/vllm/args.py

Signed-off-by: Hannah Zhang <[email protected]>

pull-request-size bot added the size/L label Aug 19, 2025

copy-pr-bot bot temporarily deployed to GITLAB August 19, 2025 05:22 Inactive

github-actions bot added the feat label Aug 19, 2025

copy-pr-bot bot temporarily deployed to GITLAB August 19, 2025 05:23 Inactive

copy-pr-bot bot temporarily deployed to GITLAB August 20, 2025 05:14 Inactive

alec-flowers marked this pull request as ready for review August 20, 2025 05:18

alec-flowers requested review from nnshah1, tanmayv25, piotrm-nvidia, ptarasiewiczNV, ryanolson, grahamking, paulhendricks, biswapanda, tmonty12, GuanLuo, rmccorm4, tedzhouhk, kkranen, jthomson04, ishandhanani, PeaBrane, hhzhang16 and a team as code owners August 20, 2025 05:18

allow user to adjust kv_transfer_config

c10acdd

alec-flowers force-pushed the aflowers/kv-transfer-config-updates branch from c6eda51 to c380e13 Compare August 20, 2025 05:20

copy-pr-bot bot temporarily deployed to GITLAB August 20, 2025 05:20 Inactive

copy-pr-bot bot temporarily deployed to GITLAB August 20, 2025 05:21 Inactive

--connector flag to simplify

f203490

alec-flowers force-pushed the aflowers/kv-transfer-config-updates branch from c380e13 to f203490 Compare August 20, 2025 05:22

copy-pr-bot bot temporarily deployed to GITLAB August 20, 2025 05:23 Inactive

copy-pr-bot bot temporarily deployed to GITLAB August 20, 2025 05:26 Inactive

ZichengMa reviewed Aug 20, 2025

View reviewed changes

components/backends/vllm/src/dynamo/vllm/args.py Outdated Show resolved Hide resolved

order matters for connectors

f300b72

copy-pr-bot bot temporarily deployed to GITLAB August 20, 2025 21:48 Inactive

copy-pr-bot bot temporarily deployed to GITLAB August 20, 2025 21:49 Inactive

ishandhanani approved these changes Aug 20, 2025

View reviewed changes

add doc

b0c2459

copy-pr-bot bot temporarily deployed to GITLAB August 20, 2025 21:54 Inactive

copy-pr-bot bot temporarily deployed to GITLAB August 20, 2025 21:55 Inactive

alec-flowers enabled auto-merge (squash) August 20, 2025 21:55

alec-flowers merged commit bc290e7 into main Aug 20, 2025
12 of 13 checks passed

alec-flowers deleted the aflowers/kv-transfer-config-updates branch August 20, 2025 22:20

hhzhang16 pushed a commit that referenced this pull request Aug 27, 2025

feat: allow user to adjust kv_transfer_config (#2517)

5a3e7a9

Signed-off-by: Hannah Zhang <[email protected]>

nv-anants pushed a commit that referenced this pull request Aug 28, 2025

feat: allow user to adjust kv_transfer_config (#2517)

a124d7f

KrishnanPrash pushed a commit that referenced this pull request Sep 2, 2025

feat: allow user to adjust kv_transfer_config (#2517)

570ef5a

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: allow user to adjust kv_transfer_config #2517

feat: allow user to adjust kv_transfer_config #2517

Uh oh!

alec-flowers commented Aug 19, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

alec-flowers commented Aug 20, 2025

Uh oh!

coderabbitai bot commented Aug 20, 2025 •

edited

Loading

Review failed

Chat

Support

CodeRabbit Commands (Invoked using PR/Issue comments)

Other keywords and placeholders

Status, Documentation and Community

Uh oh!

ZichengMa commented Aug 20, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

feat: allow user to adjust kv_transfer_config #2517

feat: allow user to adjust kv_transfer_config #2517

Uh oh!

Conversation

alec-flowers commented Aug 19, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Overview:

Details:

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

Summary by CodeRabbit

Uh oh!

alec-flowers commented Aug 20, 2025

Uh oh!

coderabbitai bot commented Aug 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

Chat

Support

CodeRabbit Commands (Invoked using PR/Issue comments)

Other keywords and placeholders

Status, Documentation and Community

Uh oh!

ZichengMa commented Aug 20, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

alec-flowers commented Aug 19, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Aug 20, 2025 •

edited

Loading