
Commit 07cfc3a

ryanolson, oandreeva-nv, ziqifan617, John Thompson, and richardhuo-nv authored
feat: kvbm + connector (#2258)
Signed-off-by: Ryan Olson <[email protected]>
Co-authored-by: Olga Andreeva <[email protected]>
Co-authored-by: Ziqi Fan <[email protected]>
Co-authored-by: John Thompson <[email protected]>
Co-authored-by: Richard Huo <[email protected]>
Co-authored-by: Zicheng Ma <[email protected]>
1 parent bf5862a commit 07cfc3a

110 files changed: +22252 -5518 lines changed


Cargo.lock

Lines changed: 714 additions & 627 deletions
Some generated files are not rendered by default.

container/Dockerfile.kvbm

Lines changed: 497 additions & 0 deletions
Large diffs are not rendered by default.

container/build.sh

Lines changed: 4 additions & 1 deletion
@@ -49,7 +49,8 @@ PYTHON_PACKAGE_VERSION=${current_tag:-$latest_tag.dev+$commit_id}
 # dependencies are specified in the /container/deps folder and
 # installed within framework specific sections of the Dockerfile.
 
-declare -A FRAMEWORKS=(["VLLM"]=1 ["TRTLLM"]=2 ["NONE"]=3 ["SGLANG"]=4)
+declare -A FRAMEWORKS=(["VLLM"]=1 ["TRTLLM"]=2 ["NONE"]=3 ["SGLANG"]=4 ["KVBM"]=5)
+
 DEFAULT_FRAMEWORK=VLLM
 
 SOURCE_DIR=$(dirname "$(readlink -f "$0")")
@@ -414,6 +415,8 @@ elif [[ $FRAMEWORK == "NONE" ]]; then
     DOCKERFILE=${SOURCE_DIR}/Dockerfile
 elif [[ $FRAMEWORK == "SGLANG" ]]; then
     DOCKERFILE=${SOURCE_DIR}/Dockerfile.sglang
+elif [[ $FRAMEWORK == "KVBM" ]]; then
+    DOCKERFILE=${SOURCE_DIR}/Dockerfile.kvbm
 fi
 
 # Add NIXL_REF as a build argument
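
The second hunk extends the chain that maps a framework name to a Dockerfile. The following is a minimal standalone sketch of that selection, not the actual `container/build.sh`. The helper name `select_dockerfile` is hypothetical. The sketch also assumes the framework name is matched case-insensitively, which the diff does not show. Only the three mappings visible in this hunk are included.

```bash
# Hypothetical, simplified sketch of the framework -> Dockerfile selection.
# select_dockerfile is an illustrative name; it does not exist in build.sh.
select_dockerfile() {
    source_dir=$1
    # Assumption: framework names are compared case-insensitively.
    framework=$(printf '%s' "$2" | tr '[:lower:]' '[:upper:]')

    case $framework in
        NONE)   printf '%s\n' "${source_dir}/Dockerfile" ;;
        SGLANG) printf '%s\n' "${source_dir}/Dockerfile.sglang" ;;
        KVBM)   printf '%s\n' "${source_dir}/Dockerfile.kvbm" ;;
        VLLM|TRTLLM)
            # Valid frameworks, but their Dockerfile paths are outside this hunk.
            echo "mapping for $framework is not shown in this diff" >&2
            return 2 ;;
        *)
            echo "unknown framework: $2" >&2
            return 1 ;;
    esac
}
```

For example, `select_dockerfile /repo/container kvbm` prints `/repo/container/Dockerfile.kvbm`, and an unrecognized name returns a non-zero status.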

container/deps/requirements.test.txt

Lines changed: 2 additions & 0 deletions
@@ -13,6 +13,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
+# For IFEval dataset loading in kvbm tests
+datasets
 psutil>=5.0.0
 pyright
 pytest

container/run.sh

Lines changed: 10 additions & 1 deletion
@@ -24,7 +24,8 @@ RUN_PREFIX=
 # dependencies are specified in the /container/deps folder and
 # installed within framework specific sections of the Dockerfile.
 
-declare -A FRAMEWORKS=(["VLLM"]=1 ["TRTLLM"]=2 ["NONE"]=3 ["SGLANG"]=4)
+declare -A FRAMEWORKS=(["VLLM"]=1 ["TRTLLM"]=2 ["NONE"]=3 ["SGLANG"]=4 ["KVBM"]=5)
+
 DEFAULT_FRAMEWORK=VLLM
 
 SOURCE_DIR=$(dirname "$(readlink -f "$0")")
@@ -276,6 +277,14 @@ get_options() {
     if [ -n "$USE_NIXL_GDS" ]; then
         VOLUME_MOUNTS+=" -v /run/udev:/run/udev:ro "
         NIXL_GDS_CAPS="--cap-add=IPC_LOCK"
+
+        # NOTE(jthomson04): In the KVBM disk pools, we currently allocate our files in /tmp.
+        # For some arcane reason, GDS requires that /tmp be mounted.
+        # This is already handled for us if we set --mount-workspace
+        # If we aren't mounting our workspace but need GDS, we need to mount /tmp.
+        if [ -z "$MOUNT_WORKSPACE" ]; then
+            VOLUME_MOUNTS+=" -v /tmp:/tmp "
+        fi
     else
         NIXL_GDS_CAPS=""
     fi
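
The added branch mounts `/tmp` only when GDS is requested and the workspace is not already mounted. The rule can be exercised in isolation with the sketch below. It is a hypothetical standalone function, not the actual `container/run.sh`. The name `gds_volume_mounts` and its positional parameters are illustrative. The sketch assumes `USE_NIXL_GDS` and `MOUNT_WORKSPACE` are non-empty exactly when `--use-nixl-gds` and `--mount-workspace` are passed, and it leaves out the `IPC_LOCK` capability handling.

```bash
# Hypothetical standalone sketch of the mount decision in this hunk.
# gds_volume_mounts is an illustrative name; it does not exist in run.sh.
gds_volume_mounts() {
    use_gds=$1          # assumed non-empty when --use-nixl-gds is given
    mount_workspace=$2  # assumed non-empty when --mount-workspace is given
    mounts=""

    if [ -n "$use_gds" ]; then
        mounts="$mounts -v /run/udev:/run/udev:ro"
        # Per the NOTE in the diff, --mount-workspace already covers /tmp,
        # so /tmp is added only when the workspace is not mounted.
        if [ -z "$mount_workspace" ]; then
            mounts="$mounts -v /tmp:/tmp"
        fi
    fi

    # Strip the single leading space introduced by the concatenation.
    printf '%s\n' "${mounts# }"
}
```

Calling `gds_volume_mounts 1 ""` yields both mounts, `gds_volume_mounts 1 1` yields only the udev mount, and without GDS the output is empty.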

docs/guides/run_kvbm_in_vllm.md

Lines changed: 64 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,64 @@
1+
<!--
2+
SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
3+
SPDX-License-Identifier: Apache-2.0
4+
5+
Licensed under the Apache License, Version 2.0 (the "License");
6+
you may not use this file except in compliance with the License.
7+
You may obtain a copy of the License at
8+
9+
http://www.apache.org/licenses/LICENSE-2.0
10+
11+
Unless required by applicable law or agreed to in writing, software
12+
distributed under the License is distributed on an "AS IS" BASIS,
13+
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14+
See the License for the specific language governing permissions and
15+
limitations under the License.
16+
-->
17+
18+
# Running KVBM in vLLM
19+
20+
This guide explains how to leverage KVBM (KV Block Manager) to mange KV cache and do KV offloading in vLLM.
21+
22+
To learn what KVBM is, please check [here](https://docs.nvidia.com/dynamo/latest/architecture/kvbm_intro.html)
23+
24+
## Quick Start
25+
26+
To use KVBM in vLLM, you can follow the steps below:
27+
28+
```bash
29+
# start up etcd for KVBM leader/worker registration and discovery
30+
docker compose -f deploy/metrics/docker-compose.yml up -d
31+
32+
# build a container containing vllm and kvbm
33+
./container/build.sh --framework kvbm
34+
35+
# launch the container
36+
./container/run.sh --framework kvbm -it --mount-workspace --use-nixl-gds
37+
38+
# enable using kvbm instead of vllm's own kv cache manager
39+
export DYN_KVBM_MANAGER=kvbm
40+
41+
# enable kv offloading to CPU memory
42+
# 4 means 4GB of CPU memory would be used
43+
export DYN_KVBM_CPU_CACHE_GB=4
44+
45+
# enable kv offloading to disk
46+
# 8 means 8GB of disk would be used
47+
export DYN_KVBM_DISK_CACHE_GB=8
48+
49+
# serve an example LLM model
50+
vllm serve deepseek-ai/DeepSeek-R1-Distill-Llama-8B
51+
52+
# make a call to LLM
53+
curl localhost:8000/v1/chat/completions -H "Content-Type: application/json" -d '{
54+
"model": "deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
55+
"messages": [
56+
{
57+
"role": "user",
58+
"content": "In the heart of Eldoria, an ancient land of boundless magic and mysterious creatures, lies the long-forgotten city of Aeloria. Once a beacon of knowledge and power, Aeloria was buried beneath the shifting sands of time, lost to the world for centuries. You are an intrepid explorer, known for your unparalleled curiosity and courage, who has stumbled upon an ancient map hinting at ests that Aeloria holds a secret so profound that it has the potential to reshape the very fabric of reality. Your journey will take you through treacherous deserts, enchanted forests, and across perilous mountain ranges. Your Task: Character Background: Develop a detailed background for your character. Describe their motivations for seeking out Aeloria, their skills and weaknesses, and any personal connections to the ancient city or its legends. Are they driven by a quest for knowledge, a search for lost familt clue is hidden."
59+
}
60+
],
61+
"stream":false,
62+
"max_tokens": 30
63+
}'
64+
```
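
The request body in the quick start is a hand-written JSON literal. When varying the prompt or `max_tokens`, it may be easier to generate the body from a small helper. The sketch below assumes a hypothetical function `build_chat_payload`, which is not part of Dynamo or vLLM, and does no JSON escaping, so the model name and prompt must not contain double quotes, backslashes, or newlines.

```bash
# Hypothetical helper: builds the chat completion request body used in the guide.
# Assumption: $1 (model) and $2 (prompt) need no JSON escaping.
build_chat_payload() {
    model=$1
    prompt=$2
    max_tokens=${3:-30}   # defaults to 30, the value used in the guide

    printf '{"model": "%s", "messages": [{"role": "user", "content": "%s"}], "stream": false, "max_tokens": %s}\n' \
        "$model" "$prompt" "$max_tokens"
}

# Usage (requires the vllm server from the quick start to be running):
# curl localhost:8000/v1/chat/completions -H "Content-Type: application/json" \
#     -d "$(build_chat_payload deepseek-ai/DeepSeek-R1-Distill-Llama-8B 'Tell me a story' 30)"
```

For prompts with arbitrary characters, a JSON-aware tool is a safer choice than `printf`.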
