
[TRTLLM-12762][test] Add multi-node TP coverage for MiniMax-M2 #15361

Open
jieli-matrix wants to merge 1 commit into NVIDIA:main from jieli-matrix:test/minimax-m2-multinode-tp



@jieli-matrix jieli-matrix commented Jun 15, 2026

Collaborator

Add a MiniMax-M2 (FP8 block-scales) case to test_multi_nodes_eval at TP16/PP1/EP16, covering the cross-node tensor-parallel fallback added in PR #14314: NCCL all-reduce RMS norm, per-tensor QK-norm, and head-level weight replication (num_kv_heads=8 < tp_size=16). The case is marked skip_pre_hopper and reuses the existing MMLU threshold. The TP16 case is also registered in the QA multi-node functional test list.
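To illustrate why num_kv_heads=8 < tp_size=16 calls for head-level replication, here is a minimal sketch of one common assignment scheme, in which each KV head is held by a contiguous group of ranks. The function name `kv_head_for_rank` and the contiguous-group layout are assumptions for illustration only; the actual mapping used by the fallback in PR #14314 may differ.

```python
def kv_head_for_rank(rank: int, num_kv_heads: int, tp_size: int) -> int:
    """Return the KV head a tensor-parallel rank holds under simple replication.

    Hypothetical helper: assumes tp_size is a multiple of num_kv_heads and
    that replicas of a head sit on consecutive ranks.
    """
    if not 0 <= rank < tp_size:
        raise ValueError("rank must be in [0, tp_size)")
    if tp_size % num_kv_heads != 0:
        raise ValueError("this sketch needs tp_size to be a multiple of num_kv_heads")
    replicas_per_head = tp_size // num_kv_heads
    return rank // replicas_per_head


# Values taken from the PR description.
NUM_KV_HEADS = 8
TP_SIZE = 16

assignment = [kv_head_for_rank(r, NUM_KV_HEADS, TP_SIZE) for r in range(TP_SIZE)]
print(assignment)
```

With 8 KV heads and 16 ranks, every head ends up on two ranks in this scheme, so each rank still has a complete head to attend over instead of a fraction of one.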

Summary by CodeRabbit

  • Tests
    • Added test coverage for MiniMax-M2 model in multi-node evaluation scenarios.

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • If PR introduces API changes, an appropriate PR label is added - either api-compatible or api-breaking. For api-breaking, include BREAKING in the PR title.

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

Add a MiniMax-M2 (FP8 block-scales) case to test_multi_nodes_eval at
TP16/PP1/EP16, covering the cross-node tensor-parallel fallback added in
PR NVIDIA#14314: NCCL all-reduce RMS norm, per-tensor QK-norm, and head-level
weight replication (num_kv_heads=8 < tp_size=16). The case is marked
skip_pre_hopper and reuses the existing MMLU threshold. Registered the
TP16 case in the QA multi-node functional test list.

Signed-off-by: Jie Li <lijie@nvidia.com>
@jieli-matrix jieli-matrix self-assigned this Jun 15, 2026
@jieli-matrix jieli-matrix requested review from a team as code owners June 15, 2026 03:45

coderabbitai Bot commented Jun 15, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: b56e9e77-c9cc-4514-ba8b-d7bb2596dc0c

📥 Commits

Reviewing files that changed from the base of the PR and between aa3236b and a8e18bc.

📒 Files selected for processing (2)
  • tests/integration/defs/test_e2e.py
  • tests/integration/test_lists/qa/llm_function_multinode.txt

📝 Walkthrough

Adds MiniMax-M2 as a new parametrized model entry in test_multi_nodes_eval with the skip_pre_hopper marker, and registers the corresponding test_e2e.py::test_multi_nodes_eval[MiniMax-M2-tp16-mmlu] test identifier in the QA multi-node function test list.

Changes

MiniMax-M2 Multi-Node Eval Registration

| Layer / File(s) | Summary |
| --- | --- |
| MiniMax-M2 test parameter and QA list entry: tests/integration/defs/test_e2e.py, tests/integration/test_lists/qa/llm_function_multinode.txt | Adds pytest.param('MiniMax-M2', marks=skip_pre_hopper) to test_multi_nodes_eval parametrization and adds the MiniMax-M2-tp16-mmlu test identifier to the QA multi-node list. |
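Because the change touches both the parametrization and the QA list, the two have to agree on the test identifier. The following is a minimal sketch of such a consistency check, assuming the list file holds one test ID per line and that `#` starts a comment. The helpers `build_test_id` and `is_registered`, and the sample list contents, are hypothetical and not part of the TRT-LLM test tooling.

```python
def build_test_id(module: str, test: str, params: list) -> str:
    """Join parameter IDs with '-' in the pytest node-ID style."""
    return f"{module}::{test}[{'-'.join(params)}]"


def is_registered(test_id: str, list_text: str) -> bool:
    """Check whether test_id appears as a full line in the list text."""
    entries = {
        line.strip()
        for line in list_text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    }
    return test_id in entries


# Hypothetical list contents; only the MiniMax-M2 entry comes from this PR.
SAMPLE_LIST = """\
# multi-node functional tests (illustrative excerpt)
test_e2e.py::test_multi_nodes_eval[MiniMax-M2-tp16-mmlu]
"""

test_id = build_test_id(
    "test_e2e.py", "test_multi_nodes_eval", ["MiniMax-M2", "tp16", "mmlu"]
)
print(test_id, is_registered(test_id, SAMPLE_LIST))
```

A check of this kind would catch a mismatch between the parameter name in test_e2e.py and the entry in the list file before the multi-node job is scheduled.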

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: adding MiniMax-M2 model coverage to the multi-node tensor-parallel test suite. |
| Description check | ✅ Passed | The PR description is mostly complete with a detailed explanation of what was added and why, covering the technical details of the test case and its purpose. However, it lacks a formal "Description" section and "Test Coverage" section as specified in the template structure. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
