Add FP/INT triton kernels and unit tests, also update QAT example #58
Merged: chichun-charlie-liu merged 8 commits into foundation-model-stack:main from chichun-charlie-liu:triton-kernel on Jan 30, 2025.
Conversation
tharapalanivel (Collaborator) approved these changes on Jan 30, 2025 and left a comment:
LGTM, thanks @chichun-charlie-liu!
Merged commit 9301123 into foundation-model-stack:main. 11 checks passed.
Description of the change
In addition to the CUTLASS kernel, we added a new Triton matmul kernel that supports FP32, FP16, BF16, FP8, and INT8. The Triton kernel is more flexible and easier to hack than CUTLASS. Although the INT8 performance of this Triton kernel is only on par with FP16 torch.matmul (CUTLASS is ~2x faster), Triton provides a valuable path for studying HW behaviors and running detailed simulations. For example, we can apply truncation on the accumulator far more efficiently than with serial torch.matmul, and with much cleaner code than writing it in CUTLASS.
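For a rough picture of what such a kernel looks like, here is a minimal sketch of an INT8 Triton matmul with an INT32 accumulator (this is illustrative only, not the kernel added in this PR; names such as `int8_matmul` are hypothetical), with a comment marking where accumulator truncation could be injected:

```python
import torch
import triton
import triton.language as tl


@triton.jit
def int8_matmul_kernel(
    a_ptr, b_ptr, c_ptr,
    M, N, K,
    stride_am, stride_ak,
    stride_bk, stride_bn,
    stride_cm, stride_cn,
    BLOCK_M: tl.constexpr, BLOCK_N: tl.constexpr, BLOCK_K: tl.constexpr,
):
    # One program computes a BLOCK_M x BLOCK_N tile of C = A @ B.
    pid_m = tl.program_id(0)
    pid_n = tl.program_id(1)
    offs_m = pid_m * BLOCK_M + tl.arange(0, BLOCK_M)
    offs_n = pid_n * BLOCK_N + tl.arange(0, BLOCK_N)
    offs_k = tl.arange(0, BLOCK_K)
    a_ptrs = a_ptr + offs_m[:, None] * stride_am + offs_k[None, :] * stride_ak
    b_ptrs = b_ptr + offs_k[:, None] * stride_bk + offs_n[None, :] * stride_bn
    # INT8 inputs accumulate into an INT32 accumulator.
    acc = tl.zeros((BLOCK_M, BLOCK_N), dtype=tl.int32)
    for k in range(0, tl.cdiv(K, BLOCK_K)):
        k_mask = offs_k < K - k * BLOCK_K
        a = tl.load(a_ptrs, mask=(offs_m[:, None] < M) & k_mask[None, :], other=0)
        b = tl.load(b_ptrs, mask=k_mask[:, None] & (offs_n[None, :] < N), other=0)
        acc += tl.dot(a, b)
        # A reduced-width HW accumulator could be emulated here by clamping
        # between K-steps, e.g. to a signed 24-bit range:
        # acc = tl.maximum(tl.minimum(acc, 2**23 - 1), -(2**23))
        a_ptrs += BLOCK_K * stride_ak
        b_ptrs += BLOCK_K * stride_bk
    c_ptrs = c_ptr + offs_m[:, None] * stride_cm + offs_n[None, :] * stride_cn
    tl.store(c_ptrs, acc, mask=(offs_m[:, None] < M) & (offs_n[None, :] < N))


def int8_matmul(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """(M, K) int8 x (K, N) int8 -> (M, N) int32."""
    assert a.dtype == torch.int8 and b.dtype == torch.int8
    M, K = a.shape
    _, N = b.shape
    c = torch.empty((M, N), device=a.device, dtype=torch.int32)
    grid = (triton.cdiv(M, 64), triton.cdiv(N, 64))
    int8_matmul_kernel[grid](
        a, b, c, M, N, K,
        a.stride(0), a.stride(1), b.stride(0), b.stride(1),
        c.stride(0), c.stride(1),
        BLOCK_M=64, BLOCK_N=64, BLOCK_K=32,
    )
    return c
```

Because the whole kernel is a few dozen lines of Python, experiments like the commented-out accumulator clamp are a one-line change, which is the flexibility advantage over CUTLASS described above.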
Related issue number
There is a compatibility issue between our existing CUTLASS kernel and torch.compile(..., mode="reduce-overhead"), which is blocking us from advancing from PyTorch 2.3 to PyTorch 2.4. With the addition of the new Triton kernel, there is at least an alternative run path for the entire INT8 QAT example (including the lowering part) when using PyTorch 2.4.
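For reference, the mode in question is used as follows (an illustrative snippet, not code from this repo; the tiny model stands in for the QAT example's lowered model):

```python
import torch

# Hypothetical stand-in for the lowered quantized model.
model = torch.nn.Sequential(torch.nn.Linear(1024, 1024), torch.nn.ReLU()).cuda()

# "reduce-overhead" enables CUDA graphs. Custom extension ops (such as a
# CUTLASS kernel wrapped as a C++ extension) can break graph capture,
# whereas Triton kernels go through the compiler stack natively.
compiled = torch.compile(model, mode="reduce-overhead")

x = torch.randn(8, 1024, device="cuda")
y = compiled(x)  # first call triggers compilation
```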
How to verify the PR
The INT8 QAT example has a lowering option which previously only supported CUTLASS. With the newly added Triton kernel, we now have a second option for running the quantized model on a real INT engine.
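A sketch of the dataflow being verified (helper names are hypothetical, and an int32 CPU matmul stands in for the real INT engine):

```python
import torch

def quantize_sym(x: torch.Tensor, n_bits: int = 8):
    # Symmetric per-tensor quantization to int8.
    qmax = 2 ** (n_bits - 1) - 1                    # 127 for INT8
    scale = x.abs().max() / qmax
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax).to(torch.int8)
    return q, scale

x = torch.randn(4, 64)   # activations
w = torch.randn(64, 32)  # weights
xq, sx = quantize_sym(x)
wq, sw = quantize_sym(w)

# The real INT engine receives genuine INT8 tensors and accumulates in
# INT32; an int32 matmul serves as the reference here.
y_int32 = xq.to(torch.int32) @ wq.to(torch.int32)
y = y_int32.float() * (sx * sw)                     # dequantize

print((y - x @ w).abs().max())  # quantization error vs. FP reference
```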
Note that in this model-lowering experiment, the quantized model passes real INT8 tensors to the matmul operator. Therefore, the kernel needs to be able to:
Was the PR tested