Pull requests: NVIDIA/Megatron-LM

Pull requests list

Add preliminary Muon+M-FSDP support
#4486 opened Apr 27, 2026 by janEbert Contributor Draft
Standardize misc graph interface
complexity: medium
#4485 opened Apr 27, 2026 by tdene Contributor
5 tasks
Core 0.16
Checkpoint conversion between GPT_model and Hybrid_model
#4482 opened Apr 27, 2026 by guihong-nv Contributor Draft
1 of 5 tasks
[dev] [DeepSeek-v4] Part 2: Hash MoE, SwiGLU clamp, and new mHC contract
dev branch
#4481 opened Apr 27, 2026 by hxbai Contributor Draft
5 tasks
feat(attention): Add attention_per_head_gate and rotary_base_per_laye…
#4473 opened Apr 26, 2026 by shifangx Contributor
5 tasks
Add Hybrid Transformer block fusion
complexity: high
#4463 opened Apr 24, 2026 by janEbert Contributor
Core 0.16
ci: Fix event name reference in CI workflow condition for merge group
Approved, complexity: low
#4462 opened Apr 24, 2026 by balasaajay Contributor
5 tasks
Core 0.16
Fused add rmsnorm
#4459 opened Apr 24, 2026 by wdykas Contributor Draft
5 tasks
Core 0.16
[dev] [DeepSeek-v4] Part 1: Hybrid Attention with CSA and HCA
dev branch
#4458 opened Apr 24, 2026 by hxbai Contributor Draft
5 tasks