
try fit in new transformers moe #110

Open
ETOgaosion wants to merge 16 commits into ISEEKYAN:main from
ETOgaosion:fix/transformers_moe

Conversation

Collaborator

ETOgaosion commented Mar 30, 2026

transformers > 5.0 changes the format of all MoE weights: instead of per-expert linear layers, each MoE block now holds one large fused expert gate_up tensor and one down tensor covering all experts:

https://github.com/huggingface/transformers/blob/aad13b87ed59f2afcfaebc985f403301887a35fc/src/transformers/models/qwen3_5_moe/modeling_qwen3_5_moe.py#L820-L821

        self.gate_up_proj = nn.Parameter(torch.empty(self.num_experts, 2 * self.intermediate_dim, self.hidden_dim))
        self.down_proj = nn.Parameter(torch.empty(self.num_experts, self.hidden_dim, self.intermediate_dim))
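A minimal sketch of how this fused layout relates to the legacy per-expert format (the dimensions here are hypothetical, chosen for illustration): each expert's legacy 2-D weight is a slice along the first axis of the fused 3-D parameter.

```python
import torch
from torch import nn

# Hypothetical dimensions for illustration only.
num_experts, intermediate_dim, hidden_dim = 4, 16, 8

# Fused layout (transformers > 5.0): one 3-D tensor covering all experts.
gate_up_proj = nn.Parameter(torch.empty(num_experts, 2 * intermediate_dim, hidden_dim))
down_proj = nn.Parameter(torch.empty(num_experts, hidden_dim, intermediate_dim))

# Slicing the first axis recovers one expert's legacy 2-D weights.
expert0_gate_up = gate_up_proj[0]  # shape: (2 * intermediate_dim, hidden_dim)
expert0_down = down_proj[0]        # shape: (hidden_dim, intermediate_dim)
```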

Collaborator Author

ETOgaosion commented Mar 30, 2026

Now we support: loading legacy model checkpoints saved with per-expert keys such as experts.x.gate_up_proj (detected from the actual state-dict keys; the new API is also supported).

And saving models through the new state_dict() API with the fused experts.gate_up_proj key (selected according to transformers_version in the transformers config; the legacy API is also supported).
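To illustrate the key mapping this involves, here is a hedged sketch (the helper name and key pattern below are my own for illustration, not taken from this PR) that groups legacy per-expert state-dict keys under the fused key they would stack into:

```python
import re

def group_legacy_expert_keys(keys):
    """Hypothetical helper: map legacy per-expert keys like
    'model.layers.0.mlp.experts.3.gate_up_proj' to the fused key
    'model.layers.0.mlp.experts.gate_up_proj' used by the new format."""
    fused = {}
    for key in keys:
        m = re.match(r"(.*\.experts)\.(\d+)\.(gate_up_proj|down_proj)$", key)
        if m:
            prefix, idx, name = m.groups()
            # All experts of one projection collapse onto a single fused key.
            fused.setdefault(f"{prefix}.{name}", []).append((int(idx), key))
        else:
            # Non-expert keys pass through unchanged.
            fused.setdefault(key, []).append((None, key))
    return fused

legacy = [
    "model.layers.0.mlp.experts.0.gate_up_proj",
    "model.layers.0.mlp.experts.1.gate_up_proj",
    "model.layers.0.mlp.experts.0.down_proj",
    "model.layers.0.mlp.gate.weight",
]
grouped = group_legacy_expert_keys(legacy)
```

The expert indices collected per fused key give the stacking order for building the 3-D tensor (e.g. with torch.stack); the reverse direction, splitting a fused tensor back into per-expert keys, is the analogous slice-and-rename.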

