feat: optimize kv cache load/offload. #306

Kang-Meng · 2025-10-31T04:39:59Z

No description provided.

xllm/core/framework/kv_cache/kv_cache_store.cpp

RobbieLeung · 2025-11-05T13:27:40Z

xllm/models/llm/glm4_moe_mtp.h

+          return torch::Tensor();
+        }
+      }
+


Add a TODO tag; MTP needs more support.

RobbieLeung

LGTM

liutongxuan · 2025-11-06T13:43:13Z

xllm/core/distributed_runtime/worker_service.cpp

+ public:
+  ~ServerStreamHandler() {
+    if (!promise_set_.exchange(true)) {
+      try {


Why use try/catch here?
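For context on the question above: destructors are implicitly `noexcept`, so an exception that escapes one calls `std::terminate`, and `std::promise::set_value` can throw `std::future_error` (for example when the promise no longer has a shared state). A minimal sketch of the guarded pattern follows. It uses a hypothetical `StreamHandlerSketch` class and a hypothetical `finish` method; it is not the PR's `ServerStreamHandler`, whose full body is not shown in this thread.

```cpp
#include <atomic>
#include <cassert>
#include <future>

// Hypothetical stand-in for a stream handler that owns a promise.
class StreamHandlerSketch {
 public:
  std::future<bool> get_future() { return promise_.get_future(); }

  // Fulfil the promise at most once; returns true if this call set it.
  bool finish(bool ok) {
    if (promise_set_.exchange(true)) return false;
    promise_.set_value(ok);
    return true;
  }

  ~StreamHandlerSketch() {
    // exchange(true) returns the previous value, so only the first
    // caller (finish() or this destructor) proceeds to set the promise.
    if (!promise_set_.exchange(true)) {
      try {
        promise_.set_value(false);  // unblock any waiter with a failure result
      } catch (const std::future_error&) {
        // Swallow: letting this escape a destructor would call std::terminate.
      }
    }
  }

 private:
  std::promise<bool> promise_;
  std::atomic<bool> promise_set_{false};
};

// Demonstration helper: destroy a handler without finishing it and
// report what a waiter on the future observes.
inline bool result_after_abandon() {
  std::future<bool> fut;
  {
    StreamHandlerSketch h;
    fut = h.get_future();
  }  // destructor sets the promise to false here
  return fut.get();
}
```

Without the destructor's `set_value`, a waiter would instead receive a `broken_promise` error from `fut.get()`. Whether the `try`/`catch` is strictly needed depends on how the promise can be reached elsewhere, which is the point the reviewer is asking about.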

liutongxuan · 2025-11-06T13:45:30Z

xllm/core/distributed_runtime/worker_service.h

  std::unique_ptr<std::thread> polling_thread_;

  std::unique_ptr<ThreadPool> threadpool_;
+  ThreadPool copy_threadpool_{5};


Why 5 threads?

Kang-Meng requested review from RobbieLeung, liutongxuan, walsonyang and yq33victor October 31, 2025 04:39

Kang-Meng force-pushed the feat_async_copy branch 3 times, most recently from 17797ce to 034f86e Compare November 5, 2025 10:13

RobbieLeung reviewed Nov 5, 2025

Kang-Meng force-pushed the feat_async_copy branch from 2914f67 to 7cd5bd4 Compare November 6, 2025 06:31

Kang-Meng added 6 commits November 6, 2025 14:33

refactor: async device/host block copying, remove sync waits.

a09f543

feat: upgrade Mooncake to v0.3.6.

dd6a225

refactor: change host KV cache memory layout from layer-wise to block-wise.

1c902b7

feat: add layer-wise KV cache H2D copy optimization.

441dd15

feat: implement batch prefetch from store.

012fe67

feat: add dependency installation to setup script.

d4446aa
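The layout commit in the list above (1c902b7) moves the host KV cache from a layer-wise to a block-wise layout. The PR text does not show the actual layout, so the following is only an illustration of what the two terms commonly mean for offset computation. `CacheShape`, its fields, and both functions are hypothetical names, not identifiers from the PR.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical dimensions; none of these names come from the PR.
struct CacheShape {
  std::size_t num_layers;
  std::size_t num_blocks;
  std::size_t bytes_per_block_per_layer;
};

// Layer-wise: all blocks of layer 0, then all blocks of layer 1, ...
inline std::size_t layer_wise_offset(const CacheShape& s,
                                     std::size_t layer,
                                     std::size_t block) {
  return (layer * s.num_blocks + block) * s.bytes_per_block_per_layer;
}

// Block-wise: all layers of block 0, then all layers of block 1, ...
inline std::size_t block_wise_offset(const CacheShape& s,
                                     std::size_t layer,
                                     std::size_t block) {
  return (block * s.num_layers + layer) * s.bytes_per_block_per_layer;
}
```

Under this assumed block-wise layout, the data of one block across all layers occupies a single contiguous range of `num_layers * bytes_per_block_per_layer` bytes, so a whole block can be moved with one contiguous copy instead of one copy per layer.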

Kang-Meng force-pushed the feat_async_copy branch from 7cd5bd4 to d4446aa Compare November 6, 2025 06:46

RobbieLeung previously approved these changes Nov 6, 2025

liutongxuan reviewed Nov 6, 2025

feat: add page-aligned tensor creator for host KV cache.

844f089
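The commit above adds a page-aligned tensor creator for the host KV cache, but its code is not shown here. As a rough sketch of the underlying idea only, page-aligned host memory can be obtained on POSIX systems with `posix_memalign`. The names `AlignedBuffer`, `page_size`, `round_up`, and `alloc_page_aligned` are hypothetical, and wrapping the buffer in a tensor (for example with `torch::from_blob`) is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <stdlib.h>  // posix_memalign, free
#include <unistd.h>  // sysconf

// Deleter so the buffer is released with free(), matching posix_memalign.
struct FreeDeleter {
  void operator()(void* p) const { free(p); }
};
using AlignedBuffer = std::unique_ptr<void, FreeDeleter>;

// System page size in bytes (commonly 4096, but queried at runtime).
inline std::size_t page_size() {
  return static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
}

// Round n up to the next multiple of align (align must be non-zero).
inline std::size_t round_up(std::size_t n, std::size_t align) {
  return (n + align - 1) / align * align;
}

// Allocate at least `bytes` bytes, starting on a page boundary and padded
// to a whole number of pages. Returns an empty pointer on failure.
inline AlignedBuffer alloc_page_aligned(std::size_t bytes) {
  void* p = nullptr;
  const std::size_t page = page_size();
  if (posix_memalign(&p, page, round_up(bytes, page)) != 0) {
    return AlignedBuffer(nullptr);
  }
  return AlignedBuffer(p);
}
```

Page alignment is often wanted for host buffers that are pinned or registered with a transfer engine; whether that is the motivation in this PR is an assumption.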

Kang-Meng dismissed RobbieLeung’s stale review via 844f089 November 6, 2025 15:42

feat: support contributing memory to distributed KV cache storage.

349a778
