[PyTorch] Add max_logit support for MuonClip (#2195)
* add max_score for fused/unfused F16 non-CP
Signed-off-by: Charlene Yang <[email protected]>
* calculate max per head instead of max over all heads
Signed-off-by: Charlene Yang <[email protected]>
* fix fused attn max_score shape
Signed-off-by: Charlene Yang <[email protected]>
* revert FE to github
Signed-off-by: Charlene Yang <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* update FE to 1.15.0-rc
Signed-off-by: Charlene Yang <[email protected]>
* fix merge
Signed-off-by: Charlene Yang <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* reduce elementwise (ew) kernels; fix causal masks; add more tests
Signed-off-by: Charlene Yang <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* minor fix to tests
Signed-off-by: Charlene Yang <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* remove logic for flash-attn
Signed-off-by: Charlene Yang <[email protected]>
* WIP: add CP support for p2p/a2a/all_gather
Signed-off-by: Charlene Yang <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* minor improvements of implementation/tests
Signed-off-by: Charlene Yang <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* WIP: add thd support
Signed-off-by: Charlene Yang <[email protected]>
* add thd to UnfusedDPA
Signed-off-by: Charlene Yang <[email protected]>
* fix lint
Signed-off-by: Charlene Yang <[email protected]>
* more fixes for lint
Signed-off-by: Charlene Yang <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* update to FE 1.15
Signed-off-by: Charlene Yang <[email protected]>
* remove unneeded changes
Signed-off-by: Charlene Yang <[email protected]>
* disable unfused for thd + pad_between_seqs
Signed-off-by: Charlene Yang <[email protected]>
* minor fixes
Signed-off-by: Charlene Yang <[email protected]>
* disable thd for unfused until bug is fixed
Signed-off-by: Charlene Yang <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix all_gather
Signed-off-by: Charlene Yang <[email protected]>
* fix all_gather
Signed-off-by: Charlene Yang <[email protected]>
* rename max_score to max_logit
Signed-off-by: Charlene Yang <[email protected]>
* fix all_gather
Signed-off-by: Charlene Yang <[email protected]>
* fix all_gather
Signed-off-by: Charlene Yang <[email protected]>
* disable fused attn + thd
Signed-off-by: Charlene Yang <[email protected]>
---------
Signed-off-by: Charlene Yang <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>