Skip to content

Conversation

alec-flowers
Copy link
Contributor

@alec-flowers alec-flowers commented Aug 19, 2025

Overview:

Previously, we had under the hood handled overriding kv_transfer_config in vLLM. The main reason is it is a json argument and is quite verbose and at the time I had imagined we wanted to always enable NIXL.

With lmcache and kvbm being added in we are seeing more connectors. There are also multiple people who don't necessarily have nixl installed and want a way to turn off the nixl connector.

This refactor addresses this by allowing the user more control over --kv-transfer-config.

Details:

User can specify --kv-transfer-config and this will override everything else. This gives the most flexibility.
As a helper flag the user can also specify --connector and we will set up the connectors for them. The args we support art --connector [nixl, kvbm, lmcache] with the default being the nixl connector. This is to keep the old behavior impact. If you want to have no connector you can specify --connector none or --connector null or --connector and just leave it blank.

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

Summary by CodeRabbit

  • New Features
    • Added a --connector option to select KV transfer connectors (supports multiple connectors).
    • Automatic side-channel port allocation when a connector requires it; no manual port setup needed.
  • Chores
    • Launcher scripts now explicitly set connectors (defaulting to none unless LMCache is requested).
  • Refactor
    • Simplified configuration by deriving KV transfer setup from selected connectors.
  • Style
    • Reduced log verbosity: certain messages moved from INFO to DEBUG for cleaner runtime logs.

@alec-flowers alec-flowers force-pushed the aflowers/kv-transfer-config-updates branch from c380e13 to f203490 Compare August 20, 2025 05:22
@alec-flowers
Copy link
Contributor Author

@ZichengMa What do you think of this? The user has to do more to set up lmcache correctly (using the flags), however it gives them also more control and flexibility.

Copy link
Contributor

coderabbitai bot commented Aug 20, 2025

Caution

Review failed

Failed to post review comments.

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 1945f59 and c6eda51.

📒 Files selected for processing (5)
  • components/backends/vllm/launch/agg.sh (1 hunks)
  • components/backends/vllm/launch/agg_lmcache.sh (1 hunks)
  • components/backends/vllm/launch/agg_router.sh (1 hunks)
  • components/backends/vllm/src/dynamo/vllm/args.py (7 hunks)
  • components/backends/vllm/src/dynamo/vllm/main.py (2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
components/backends/vllm/src/dynamo/vllm/main.py (1)
tests/conftest.py (1)
  • logger (124-135)
components/backends/vllm/src/dynamo/vllm/args.py (1)
components/backends/vllm/src/dynamo/vllm/ports.py (4)
  • PortMetadata (57-74)
  • PortAllocationRequest (78-85)
  • allocate_and_reserve_port_block (142-226)
  • get_host_ip (263-290)
🔇 Additional comments (5)
components/backends/vllm/launch/agg_lmcache.sh (1)

15-15: LMCache launcher flag is aligned with connector model

Explicit --connector lmcache + ENABLE_LMCACHE=1 is consistent with the new connector flow and will ensure LMCache defaults are applied.

components/backends/vllm/launch/agg_router.sh (1)

12-12: Router launchers correctly opt out of connectors; confirm prefix-caching expectations

Using --connector none for both workers matches the new semantics. As with agg.sh, without explicitly disabling prefix caching, this will still enable KV events and allocate a ZMQ port by default. Confirm that’s intended for router mode; otherwise add --no-enable-prefix-caching.

Also applies to: 14-14

components/backends/vllm/src/dynamo/vllm/args.py (3)

213-255: NIXL side-channel port allocation logic LGTM

The dp-rank/TP-size aware block allocation, offset math, and negative base-port guard look solid. Using ETCD-backed reservation with block allocation avoids TOCTOU issues.


284-297: Connector → KVTransferConfig mapping looks correct

Single and multi-connector cases map to expected connector classes and include the module path for DynamoConnector. Returning None when a user supplies --kv-transfer-config is the right deferral.

Also applies to: 299-321


362-370: All set_side_channel_host_and_port calls now use the single-port signature

Verified that:

  • In components/backends/vllm/src/dynamo/vllm/args.py (line 254), the call passes only base_side_channel_port.
  • The only other call in examples/multimodal/utils/args.py invokes a separate, locally defined function—its signature still accepts (config, hostname=None).

No remaining calls with two or more arguments.

Walkthrough

Adds a connector-based KV-transfer configuration to vLLM: new CLI flag --connector, validation, and dynamic side-channel port allocation only when needed (e.g., NIXL). Updates launcher scripts to pass --connector values (none/lmcache). Adjusts logging levels in main.py. Removes side_channel_port from Config and introduces connector_list.

Changes

Cohort / File(s) Summary
Launch scripts: connector flag updates
components/backends/vllm/launch/agg.sh, components/backends/vllm/launch/agg_router.sh, components/backends/vllm/launch/agg_lmcache.sh
Replace/remove legacy flags and add explicit connector selection: use --connector none in agg.sh and agg_router.sh; use --connector lmcache in agg_lmcache.sh. No other flow changes.
vLLM args: connector-based KV transfer and ports
components/backends/vllm/src/dynamo/vllm/args.py
Introduces VALID_CONNECTORS, adds Config.connector_list, removes Config.side_channel_port, adds create_kv_transfer_config, updates CLI parsing for --connector, adjusts port allocation only when connectors require NIXL, updates set_side_channel_host_and_port signature, integrates connector-driven KVTransferConfig creation.
vLLM main: log level tweaks
components/backends/vllm/src/dynamo/vllm/main.py
Downgrades two INFO logs to DEBUG (signal handler setup and LMCache disabled notice). No functional changes.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant User
  participant Launcher as Shell launcher (agg*.sh)
  participant CLI as dynamo.vllm (CLI)
  participant Args as Args parser
  participant Ports as Port allocator
  participant Engine as vLLM Engine

  User->>Launcher: run agg*.sh (--connector X)
  Launcher->>CLI: invoke dynamo.vllm with flags
  CLI->>Args: parse args (--connector, --kv-transfer-config, ...)
  Args->>Args: validate connectors (none/lmcache/nixl/kvbm)
  alt user provided --kv-transfer-config
    Args-->>Engine: use provided KVTransferConfig
  else connector-based config
    Args->>Args: create_kv_transfer_config()
    alt connectors include "nixl"
      Args->>Ports: allocate side-channel ports (per DP rank)
      Ports-->>Args: base_side_channel_port
      Args->>Args: set_side_channel_host_and_port(port)
    end
    Args-->>Engine: KVTransferConfig (LMCache/Nixl/KVBM/Multi)
  end
  Engine-->>User: start model service
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Poem

I twitch my whiskers at flags newly shown,
Connectors aligned, each path clearly known.
No side-channel burrow unless NIXL’s in sight,
LMCache carrots cached, nibble-byte by byte.
Quiet logs whisper while engines awake—
Hop, hop, deploy; clean trails we make. 🥕🐇

Tip

🔌 Remote MCP (Model Context Protocol) integration is now available!

Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@ZichengMa
Copy link
Contributor

I think this makes sense to me. But we may need to specify this in our documents.

@alec-flowers alec-flowers enabled auto-merge (squash) August 20, 2025 21:55
@alec-flowers alec-flowers merged commit bc290e7 into main Aug 20, 2025
12 of 13 checks passed
@alec-flowers alec-flowers deleted the aflowers/kv-transfer-config-updates branch August 20, 2025 22:20
hhzhang16 pushed a commit that referenced this pull request Aug 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants