Fix/pr319 restore resize #323

Open
cui36 wants to merge 8 commits into main from fix/pr319-restore-resize

cui36 (Collaborator) commented May 3, 2026

PR #319 alloc/free microbench

bench_alloc.py: tight loop of manager.alloc(k) + manager.free(h) after 100-iter warmup.
GPU: NVIDIA GB10. NUM_LAYERS=16, NUM_BLOCKS=65536, BLOCK_SIZE=16, page_size=2 MB.
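
The timing loop described above can be sketched roughly as follows. This is not the actual bench_alloc.py: `DummyManager` is a hypothetical stand-in for the real manager, and the measured iteration count (`iters`) is an assumption, since only the 100-iteration warmup is stated here.

```python
import time


class DummyManager:
    """Hypothetical stand-in for the KV cache manager (not the real class)."""

    def __init__(self, num_blocks):
        self._free = list(range(num_blocks))

    def alloc(self, k):
        # Hand out k block ids from the end of the free list.
        if k <= 0:
            raise ValueError("k must be positive")
        if k > len(self._free):
            raise RuntimeError("out of blocks")
        handles = self._free[-k:]
        del self._free[-k:]
        return handles

    def free(self, handles):
        # Return the block ids to the free list.
        self._free.extend(handles)

    def available(self):
        return len(self._free)


def bench_alloc_free(manager, k, iters=10_000, warmup=100):
    """Return mean microseconds per alloc(k) + free(h) pair."""
    # Warmup iterations are excluded from the measurement.
    for _ in range(warmup):
        manager.free(manager.alloc(k))
    start = time.perf_counter()
    for _ in range(iters):
        h = manager.alloc(k)
        manager.free(h)
    elapsed = time.perf_counter() - start
    return elapsed / iters * 1e6


if __name__ == "__main__":
    mgr = DummyManager(num_blocks=65536)
    for k in (1, 4, 16, 64, 256):
        print(f"k={k:>3}: {bench_alloc_free(mgr, k):.2f} us per pair")
```

Swapping `DummyManager` for the real manager object would give numbers comparable to the ones reported, provided it exposes `alloc(k)` and `free(h)` as the benchmark description implies.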

Results (μs per alloc+free pair)

| k | main (Python alloc + Python poll) | PR #319 HEAD incl. must-fix 98d9bb3 (C++ alloc, no resize poll) | fix branch (C++ alloc + restored resize poll) |
|---|---|---|---|
| 1 | 66.75 | 23.94 | 29.18 |
| 4 | 62.86 | 23.84 | 30.13 |
| 16 | 65.76 | 22.69 | 30.84 |
| 64 | 66.16 | 25.58 | 31.69 |
| 256 | 91.65 | 43.81 | 46.29 |
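
To make the comparison easier to read, the figures in the table can be reduced to two derived numbers per k: the extra cost of the restored resize poll (fix branch minus PR #319 HEAD) and the speedup of the fix branch over main. The sketch below only does arithmetic on the reported values; the names `RESULTS` and `summarize` are illustrative and not part of the PR.

```python
# Microseconds per alloc+free pair, copied from the results table:
# k -> (main, PR #319 HEAD, fix branch)
RESULTS = {
    1: (66.75, 23.94, 29.18),
    4: (62.86, 23.84, 30.13),
    16: (65.76, 22.69, 30.84),
    64: (66.16, 25.58, 31.69),
    256: (91.65, 43.81, 46.29),
}


def summarize(results):
    """Derive resize-poll cost and speedup over main for each k."""
    rows = []
    for k, (main_us, pr_us, fix_us) in sorted(results.items()):
        rows.append({
            "k": k,
            # Extra time the restored resize poll adds on top of PR #319 HEAD.
            "poll_cost_us": round(fix_us - pr_us, 2),
            # How many times faster the fix branch is than main.
            "speedup_vs_main": round(main_us / fix_us, 2),
        })
    return rows


if __name__ == "__main__":
    for row in summarize(RESULTS):
        print(row)
```

On these numbers the restored poll costs between about 2.5 μs and 8.2 μs per pair, and the fix branch is still roughly 2x faster than main at every k.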

lianghao208 and others added 4 commits April 30, 2026 14:04
- Bind PageAllocator::check_and_get_resize_target so kv_cache_manager can poll

- Pin KVCACHED_IPC_NAME from Python so C++ MemInfoTracker uses the same shm

- Include pybind11/functional.h and pybind11/stl.h in torch_bindings

- Update test_kvcache_manager.py to use C++ public methods
Times alloc(k) + free(handles) cycles at varying k. Used to compare main vs C++ migration vs C++ + restored elastic resize.