@zhenhuang12 zhenhuang12 commented Jan 5, 2026

Description

  1. Enable internode_dispatch to run without GPU-CPU sync by supporting num_worst_tokens

ref pr: Support CUDA Graph for internode dispatch normal kernel #438

  2. Add get_oob_ip for uccl_ep
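
For readers unfamiliar with the num_worst_tokens idea in the first item, here is a rough sketch of why a worst-case token count removes the GPU-CPU sync. It is an illustration only: the function names, the callable standing in for the device read-back, and the `num_local_tokens * num_ranks` bound are assumptions made for this example, not code or APIs from this PR or from uccl_ep.

```python
# Hypothetical illustration; none of these helpers exist in uccl_ep.
from typing import Callable, Tuple


def worst_case_recv_tokens(num_local_tokens: int, num_ranks: int) -> int:
    """Upper bound on tokens one rank can receive.

    Assumes every rank holds at most num_local_tokens tokens and sends each
    token to a given destination rank at most once.
    """
    return num_local_tokens * num_ranks


def recv_buffer_rows(
    read_count_from_gpu: Callable[[], int], num_worst_tokens: int = 0
) -> Tuple[int, bool]:
    """Return (rows to allocate, whether the host had to sync with the GPU).

    Without a worst-case bound, the host must read the real received-token
    count back from the device before it can size the receive buffer.
    With a bound, the size is known up front, so no read-back is needed and
    the buffer shape stays static (which is what graph capture requires).
    """
    if num_worst_tokens > 0:
        return num_worst_tokens, False
    # Stand-in for a blocking device-to-host read of the real count.
    return read_count_from_gpu(), True


bound = worst_case_recv_tokens(num_local_tokens=128, num_ranks=8)
static_rows, static_synced = recv_buffer_rows(lambda: 300, num_worst_tokens=bound)
dynamic_rows, dynamic_synced = recv_buffer_rows(lambda: 300)
```

The trade-off in this sketch is memory for latency: the static path over-allocates rows, and the caller has to ignore rows past the real count, which stays on the device.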

Fixes # (issue)

Type of Change

  • Bug fix
  • New feature
  • Documentation update

How Has This Been Tested?

Include any tests here.

  • Unit tests
  • Integration tests
  • Manual testing

Checklist

  • My code follows the style guidelines, e.g. format.sh.
  • I have run build_and_install.sh to verify compilation.
  • I have removed redundant variables and comments.
  • I have updated the documentation.
  • I have added tests.

@zhenhuang12 zhenhuang12 force-pushed the ep-internode-cuda-graph branch from af9d1c9 to bfb9ade Compare January 5, 2026 08:48

@MaoZiming MaoZiming left a comment

@zhenhuang12 Thank you. I just tested too and it worked. LGTM

@MaoZiming MaoZiming merged commit 1d56447 into main Jan 5, 2026
3 checks passed
@MaoZiming MaoZiming deleted the ep-internode-cuda-graph branch January 5, 2026 16:50