[TRTLLM-12721][fix] Bound V2 context transfer polling by chienchunhung · Pull Request #15356 · NVIDIA/TensorRT-LLM

chienchunhung · 2026-06-14T19:33:08Z

Summary

Follow-up draft PR for V2 Python transceiver bounded polling.

This keeps the V2 change separate from:

[TRTLLM-12721][fix] Bound disagg transfer status polling #15181: C++ disagg transfer status bounded polling
[TRTLLM-12721][fix] Add gated disagg in-flight cancellation #15238: gated C++ in-flight cancellation and transfer-buffer poison/quarantine

Behavior

Add an explicit blocking argument to V2 TxSession.wait_complete().
Preserve the raw TxSession API default as blocking, so existing direct transfer tests and callers keep their previous completion-barrier behavior.
Make KvCacheTransceiverV2.check_context_transfer_status(at_least_request_num=...) use nonblocking TxSession polling for bounded scheduler polls.
Treat None from the nonblocking TxSession poll as "not ready yet"; the request remains queued and is polled again later.
Preserve blockAll / at_least_request_num=None as a blocking wait-all path.
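
For reviewers, the polling contract described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: FakeTxSession, mark_done, and the pending dict are hypothetical stand-ins, and the return value of a completed wait_complete() (True here) is an assumption. Only the blocking argument, the None-means-not-ready rule, and the at_least_request_num=None wait-all path come from the description.

```python
import threading
from typing import Dict, List, Optional


class FakeTxSession:
    """Hypothetical stand-in for a V2 TxSession; not the real class."""

    def __init__(self) -> None:
        self._done = threading.Event()

    def mark_done(self) -> None:
        # Test hook (assumption): simulates the transfer finishing.
        self._done.set()

    def wait_complete(self, blocking: bool = True) -> Optional[bool]:
        # Blocking stays the default, so existing callers keep the
        # completion-barrier behavior.
        if blocking:
            self._done.wait()
            return True
        # Nonblocking poll: None means "not ready yet", not a failure.
        return True if self._done.is_set() else None


def check_context_transfer_status(
    pending: Dict[int, FakeTxSession],
    at_least_request_num: Optional[int] = None,
) -> List[int]:
    """Return ids of finished requests and remove them from ``pending``."""
    # at_least_request_num=None is the blockAll path: wait for everything.
    block_all = at_least_request_num is None
    finished: List[int] = []
    for request_id, session in list(pending.items()):
        result = session.wait_complete(blocking=block_all)
        if result is None:
            continue  # still in flight; stays queued and is polled again later
        finished.append(request_id)
        del pending[request_id]
    return finished
```

In this sketch a scheduler poll with at_least_request_num set never waits on an in-flight transfer, while a call without it still acts as a barrier over all queued requests.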

Relationship to #15181

This PR is related to #15181's bounded-polling contract, but it is intentionally V2/Python-transceiver only and is opened against main to keep the diff focused. It can be reviewed after or alongside #15181 without carrying the C++ transceiver changes in this PR.

Local validation

PYTHONPYCACHEPREFIX=/private/tmp/trtllm-pycache python -m py_compile tensorrt_llm/_torch/disaggregation/base/transfer.py tensorrt_llm/_torch/disaggregation/transceiver.py tensorrt_llm/_torch/disaggregation/native/transfer.py
git diff --check upstream/main..HEAD
PATH=/opt/miniconda3/bin:$PATH PRE_COMMIT_HOME=/private/tmp/trtllm-pre-commit pre-commit run --files tensorrt_llm/_torch/disaggregation/base/transfer.py tensorrt_llm/_torch/disaggregation/transceiver.py tensorrt_llm/_torch/disaggregation/native/transfer.py

A focused pytest run was attempted but was blocked locally by the missing transformers dependency.

Signed-off-by: Chien-Chun Hung <2679986+chienchunhung@users.noreply.github.com>

chienchunhung · 2026-06-14T20:07:44Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-06-14T20:13:22Z

PR_Github #54156 [ run ] triggered by Bot. Commit: cdb81d1 Link to invocation

tensorrt-cicd · 2026-06-15T00:35:47Z

PR_Github #54156 [ run ] completed with state SUCCESS. Commit: cdb81d1
/LLM/main/L0_MergeRequest_PR pipeline #43239 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

[TRTLLM-12721][fix] Bound V2 context transfer polling

1c2e4fa

Signed-off-by: Chien-Chun Hung <2679986+chienchunhung@users.noreply.github.com>

github-actions Bot assigned chienchunhung Jun 14, 2026

Add V2 bounded polling tests

cdb81d1

Signed-off-by: Chien-Chun Hung <2679986+chienchunhung@users.noreply.github.com>

chienchunhung wants to merge 2 commits into NVIDIA:main from chienchunhung:codex/disagg-v2-bounded-transfer-poll
