Pull requests: NVIDIA/TransformerEngine

Pull requests list

Distopt with offload
#1573 opened Mar 13, 2025 by sanandaraj5597
[QA] Add error handling
#1570 opened Mar 13, 2025 by linxiddd
11 tasks
[JAX] Unbalanced Context Parallelism with THD format
#1565 opened Mar 12, 2025 by zlsh80826
8 of 13 tasks
Draft: split wgrad for GroupedLinear
#1564 opened Mar 12, 2025 by lhb8125 Draft
13 tasks
[CI] Add isort
#1563 opened Mar 12, 2025 by yaox12 Draft
13 tasks
Enable AttnFuncWithCPAndKVP2P to support mla
#1561 opened Mar 12, 2025 by SuperCB
3 of 13 tasks
[PyTorch] Debug MXFP8 norms (label: bug)
#1560 opened Mar 12, 2025 by timmoon10 Draft
6 of 13 tasks
Blockwise scaling linear quantization recipe
#1559 opened Mar 11, 2025 by kwyss-nvidia
8 of 13 tasks
[PyTorch] Support Bgrad Cast FP8 Fusion for FP8 Current Scaling Recipe
#1558 opened Mar 11, 2025 by zhongbozhu
9 of 15 tasks
[PyTorch] Support TP Overlap in Per-Tensor Current Scaling Recipe
#1554 opened Mar 10, 2025 by BestJuly
10 of 16 tasks
change softmax_lse correction of CP to FP32
#1546 opened Mar 7, 2025 by xrennvidia
6 of 13 tasks
Subchannel Block quantized GEMM
#1545 opened Mar 6, 2025 by kwyss-nvidia
6 of 12 tasks
[PyTorch] Enable fp8_primary_weights for current scaling
#1544 opened Mar 6, 2025 by kunlunl
5 of 13 tasks
Refactoring attention.py part 1 (label: 2.2.0)
#1542 opened Mar 6, 2025 by KshitijLakhani
6 of 13 tasks
Fused Linear and Cross Entropy operations
#1537 opened Mar 5, 2025 by Jianbing-D
Parallelize CPU reference implementation in tests (label: testing)
#1534 opened Mar 4, 2025 by negvet Draft
8 of 14 tasks
Blackwell devel commoverlap mlperftests
#1529 opened Feb 28, 2025 by vasunvidia
12 tasks
[MoE] Enable MXFP8 and Per-Tensor Current Scaling for Grouped Linear
#1525 opened Feb 28, 2025 by yaox12
5 of 17 tasks
Blockwise float8 quantizer and quantized tensor class
#1513 opened Feb 27, 2025 by kwyss-nvidia
23 of 34 tasks
Draft: split wgrad poc
#1510 opened Feb 26, 2025 by lhb8125 Draft
13 tasks
Support tensors with only column-wise data (labels: enhancement, performance)
#1505 opened Feb 25, 2025 by timmoon10
7 of 13 tasks
[Pytorch] Dynamo ONNX export support
#1497 opened Feb 19, 2025 by pggPL
8 of 13 tasks
RoPE enhancements (label: 2.3.0)
#1478 opened Feb 11, 2025 by sudhakarsingh27
3 of 6 tasks
support adam bf16 state
#1465 opened Feb 8, 2025 by XiaobingSuper
6 of 13 tasks