fix(deepseek_v4): drop full-attention sharding for MoE-only strategy

feefe18
Open

fix(deepseek_v4): drop full-attention sharding for MoE-only strategy #1996

Workflow runs completed with no jobs