[veomni] feat: add DeepSeek-V3 to MOE_PARAM_HANDERS#5996

Open
Luosuu wants to merge 1 commit into verl-project:main from Luosuu:tianle/add-dpskv3-moe-param-map

Conversation

Collaborator

Luosuu commented Apr 14, 2026

What does this PR do?

Add deepseek_v3 entry to MOE_PARAM_HANDERS mapping in verl/workers/engine/veomni/utils.py, reusing the existing _map_moe_params_qwen3_moe handler since DeepSeek-V3 shares the same MoE parameter mapping logic as Qwen3-MoE.

Checklist Before Starting

Test

This is a configuration-level change (adding a key to a dictionary mapping). The mapping reuses an existing, tested handler function (_map_moe_params_qwen3_moe). No new logic is introduced.

API and Usage Example

No API changes. DeepSeek-V3 MoE models will now be automatically handled by the veomni engine during parameter mapping.

Design & Code Changes

  • Added "deepseek_v3": _map_moe_params_qwen3_moe to the MOE_PARAM_HANDERS dict in verl/workers/engine/veomni/utils.py.
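A minimal sketch of the dispatch pattern this change relies on, assuming the handlers are plain callables keyed by a model-type string. Only the dict name, the `deepseek_v3` key, and the handler name come from this PR. The handler body and signature, the `qwen3_moe` key, and the `resolve_moe_handler` helper are hypothetical stand-ins, not the actual code in verl/workers/engine/veomni/utils.py.

```python
from typing import Callable, Dict


def _map_moe_params_qwen3_moe(name: str) -> str:
    """Hypothetical stand-in for the real handler; returns the name unchanged."""
    return name


# Registry keyed by model type. The "deepseek_v3" entry mirrors the one-line
# change in this PR: it points at the same handler object as the (assumed)
# "qwen3_moe" entry.
MOE_PARAM_HANDERS: Dict[str, Callable[[str], str]] = {
    "qwen3_moe": _map_moe_params_qwen3_moe,
    "deepseek_v3": _map_moe_params_qwen3_moe,
}


def resolve_moe_handler(model_type: str) -> Callable[[str], str]:
    """Hypothetical lookup helper that fails loudly for unregistered types."""
    try:
        return MOE_PARAM_HANDERS[model_type]
    except KeyError:
        raise ValueError(
            f"No MoE parameter handler registered for {model_type!r}"
        ) from None
```

Under these assumptions, looking up `deepseek_v3` and `qwen3_moe` yields the same function object, so any behavior of the shared handler (including bugs) applies to both model types.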

Checklist Before Submitting

  • Read the Contribute Guide.
  • Apply pre-commit checks.
  • Add / Update the documentation. (N/A - no doc changes needed for this config addition)
  • Add unit or end-to-end test(s). Not feasible: this is a single-line dictionary entry addition reusing an existing handler; no new logic to test.
  • Once your PR is ready for CI, send a message in the ci-request channel.
  • If your PR is related to the recipe submodule, please also update the reference. (N/A)

Collaborator Author

Luosuu commented Apr 14, 2026

cc @pengwu22 @wuxibin89

Contributor

gemini-code-assist bot left a comment

Code Review

This pull request adds support for the deepseek_v3 model by mapping it to the existing qwen3_moe parameter handler. However, feedback indicates that reusing this handler propagates an existing bug where parameter suffixes are redundantly appended, resulting in incorrect mapping keys like '.weight.weight'.
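The kind of suffix duplication described in this review can be illustrated with a small sketch. The two helpers and the example parameter name below are hypothetical, written only to show how a key ending in '.weight.weight' can arise and how a guard avoids it. They are not taken from the verl handler.

```python
def append_suffix_naive(name: str, suffix: str = ".weight") -> str:
    """Always appends the suffix, even if the name already ends with it."""
    return name + suffix


def append_suffix_once(name: str, suffix: str = ".weight") -> str:
    """Appends the suffix only when the name does not already end with it."""
    return name if name.endswith(suffix) else name + suffix


# Hypothetical parameter name that already carries its suffix.
param = "experts.gate_proj.weight"
naive_key = append_suffix_naive(param)    # ends in ".weight.weight"
guarded_key = append_suffix_once(param)   # left unchanged
```

Whether the real handler needs such a guard depends on whether the names it receives already include the suffix, which this PR does not show.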

@pengwu22
Collaborator

It’s verified compatible with veomni, right?

Collaborator Author

Luosuu commented Apr 14, 2026

> It’s verified compatible with veomni, right?

Yes, verified by end-to-end Moonlight training.
