Pull requests: NVIDIA/TransformerEngine
Pull requests list
[JAX] Unbalanced Context Parallelism with THD format
#1565
opened Mar 12, 2025 by
zlsh80826
8 of 13 tasks
Enable AttnFuncWithCPAndKVP2P to support mla
#1561
opened Mar 12, 2025 by
SuperCB
3 of 13 tasks
Blockwise scaling linear quantization recipe
#1559
opened Mar 11, 2025 by
kwyss-nvidia
8 of 13 tasks
[PyTorch] Support Bgrad Cast FP8 Fusion for FP8 Current Scaling Recipe
#1558
opened Mar 11, 2025 by
zhongbozhu
9 of 15 tasks
[PyTorch] Support TP Overlap in Per-Tensor Current Scaling Recipe
#1554
opened Mar 10, 2025 by
BestJuly
10 of 16 tasks
change softmax_lse correction of CP to FP32
#1546
opened Mar 7, 2025 by
xrennvidia
6 of 13 tasks
[PyTorch] Enable fp8_primary_weights for current scaling
#1544
opened Mar 6, 2025 by
kunlunl
5 of 13 tasks
Refactoring attention.py part 1
2.2.0
#1542
opened Mar 6, 2025 by
KshitijLakhani
6 of 13 tasks
Parallelize CPU reference implementation in tests
Label: testing (Improvements to tests or testing infrastructure)
[MoE] Enable MXFP8 and Per-Tensor Current Scaling for Grouped Linear
#1525
opened Feb 28, 2025 by
yaox12
5 of 17 tasks
Blockwise float8 quantizer and quantized tensor class
#1513
opened Feb 27, 2025 by
kwyss-nvidia
23 of 34 tasks
Support tensors with only column-wise data
Labels: enhancement (New feature or request), performance
#1505
opened Feb 25, 2025 by
timmoon10
7 of 13 tasks