EngineCore crashes when physical KV pool is exhausted under multi-instance load #262

@shenrunzhang

Description

When running multiple kvcached instances under heavy concurrent load, one instance can crash with a fatal AssertionError in
ElasticBlockPool.get_new_blocks().

The root cause is that available_size(), which vLLM's scheduler uses to check whether blocks are available, and the
subsequent alloc() call are not atomic with respect to the shared physical pool. Another instance can consume physical
pages between the check and the allocation. In that case alloc() returns None, the following
assert block_ids is not None fires, and the EngineCore is killed.
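
To make the interleaving concrete, here is a minimal, single-process sketch of the check-then-allocate race. It is not kvcached code: `SharedPhysicalPool` and `ToyBlockPool` are hypothetical stand-ins, one block is assumed to map to one physical page, and the numbers are chosen only to match the 71 < 76 warning in the logs below.

```python
from typing import List, Optional


class SharedPhysicalPool:
    """Toy stand-in for a physical page pool shared by several instances."""

    def __init__(self, total_pages: int) -> None:
        self.free_pages = total_pages


class ToyBlockPool:
    """Hypothetical per-instance view; NOT the real ElasticBlockPool."""

    def __init__(self, shared: SharedPhysicalPool) -> None:
        self.shared = shared
        self._next_id = 0

    def available_size(self) -> int:
        # Snapshot only: the value can be stale as soon as it is returned.
        return self.shared.free_pages

    def alloc(self, n: int) -> Optional[List[int]]:
        # Returns None when the shared pool cannot satisfy the request.
        if n > self.shared.free_pages:
            return None
        self.shared.free_pages -= n
        ids = list(range(self._next_id, self._next_id + n))
        self._next_id += n
        return ids


shared = SharedPhysicalPool(total_pages=100)
inst_a, inst_b = ToyBlockPool(shared), ToyBlockPool(shared)

need = 76
check_passed = inst_a.available_size() >= need  # scheduler-side check passes
stolen = inst_b.alloc(29)                       # another instance runs in between
block_ids = inst_a.alloc(need)                  # only 71 pages remain, so None
```

The check succeeds and the allocation still fails, which is the state in which an `assert block_ids is not None` would fire.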

Environment

  • GPU: AMD MI300X (192 GB)
  • vLLM: 0.14.0
  • kvcached: (repo main)
  • Setup: 6× Qwen2.5-7B-Instruct instances, kvcached_gpu_utilization=0.90

Steps to Reproduce

  1. Launch 6 vLLM instances sharing a kvcached physical pool
  2. Run a staggered load sweep with long completions (e.g. completion_len=2048, peak_rps=20) across all instances
  3. When multiple instances are simultaneously draining heavy backlogs (high KV cache usage), one instance crashes

Logs (see attached):

[kvcached][WARNING] kv_cache_manager.py:174 available_size()=71 < need_size=76
ERROR core.py:938 EngineCore encountered a fatal error.
...
scheduler.schedule()
→ kv_cache_manager.allocate_slots()
→ coordinator.allocate_new_blocks()
→ block_pool.get_new_blocks(num_new_blocks) ← AssertionError

Expected behavior:

When alloc() returns None, the engine should handle it gracefully rather than crash.
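
One possible shape for "graceful" handling, sketched under assumptions: the scheduling step treats a None result as "retry later" and defers the request instead of asserting. `schedule_step`, `make_fake_alloc`, and the `(request id, blocks needed)` tuple are hypothetical names for illustration; the actual vLLM scheduler and kvcached allocator are structured differently, and preemption would be another valid policy.

```python
from typing import Callable, List, Optional, Tuple

# (request id, number of blocks needed); hypothetical shape for this sketch
Request = Tuple[str, int]


def schedule_step(
    waiting: List[Request],
    alloc: Callable[[int], Optional[List[int]]],
) -> Tuple[List[str], List[Request]]:
    """Try to allocate for each request; defer instead of asserting on None."""
    scheduled: List[str] = []
    deferred: List[Request] = []
    for req_id, need in waiting:
        block_ids = alloc(need)
        if block_ids is None:
            # Pool was drained after the availability check: retry next step.
            deferred.append((req_id, need))
            continue
        scheduled.append(req_id)
    return scheduled, deferred


def make_fake_alloc(free: int) -> Callable[[int], Optional[List[int]]]:
    """Fake allocator with a fixed page budget, for illustration only."""
    state = {"free": free, "next_id": 0}

    def alloc(n: int) -> Optional[List[int]]:
        if n > state["free"]:
            return None
        state["free"] -= n
        start = state["next_id"]
        state["next_id"] += n
        return list(range(start, start + n))

    return alloc


scheduled, deferred = schedule_step(
    [("req-0", 40), ("req-1", 76), ("req-2", 20)], make_fake_alloc(free=71)
)
```

With 71 free pages, the 76-block request is deferred while the two smaller requests are scheduled, and the engine keeps running.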

Notes:

The reserved_page_list mechanism partially mitigates this race for pre-mapped pages, but does not cover cases where the
needed allocation exceeds the reservation buffer.
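
The gap described in this note can be illustrated with a toy model. This is an assumption-laden sketch, not the real reserved_page_list logic: `ToyPageAllocator` is hypothetical and simply takes pages from a private reservation first, then from the contended shared pool for any remainder.

```python
class ToyPageAllocator:
    """Hypothetical model of a per-instance reservation plus a shared pool."""

    def __init__(self, reserved: int, shared_free: int) -> None:
        self.reserved = reserved        # pages already held by this instance
        self.shared_free = shared_free  # pages any instance may still take

    def alloc(self, n: int) -> bool:
        from_reserved = min(n, self.reserved)
        from_shared = n - from_reserved
        if from_shared > self.shared_free:
            # The remainder depends on the contended shared pool.
            return False
        self.reserved -= from_reserved
        self.shared_free -= from_shared
        return True


# Shared pool already drained by peer instances in this scenario.
a = ToyPageAllocator(reserved=8, shared_free=0)
small_ok = a.alloc(5)   # fits inside the reservation, so it succeeds
large_ok = a.alloc(10)  # needs 3 reserved + 7 shared pages, so it fails
```

Requests that fit inside the reservation are immune to the race in this model; anything larger still depends on the state of the shared pool at allocation time.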

Image
