Bf16*fp4 gemm #2801
Open: eliotwang wants to merge 56 commits into ROCm:develop from eliotwang:bf16_fp4_gemm
+1,262
−89
Changes from 53 commits
Commits
d1bf200
support bf16*mxfp4 gemm
4e205c4
rebase bf16*fp4 example to develop branch
k50112113 52c5ed5
Clean up commented debug code in GEMM kernel
eliotwang e1d0365
Merge branch 'develop' into bf16_fp4_gemm
eliotwang 43db1f7
Merge branch 'develop' into bf16_fp4_gemm
eliotwang ff89459
rename example folder
1409d62
Merge branch 'develop' into bf16_fp4_gemm
eliotwang 637f2e8
Merge branch 'develop' into bf16_fp4_gemm
eliotwang 30450e3
Merge branch 'develop' into bf16_fp4_gemm
eliotwang 291e36b
Merge branch 'develop' into bf16_fp4_gemm
eliotwang d2c79f8
Merge branch 'develop' into bf16_fp4_gemm
eliotwang ba84541
Merge branch 'develop' into bf16_fp4_gemm
eliotwang e31f9df
Merge branch 'develop' into bf16_fp4_gemm
eliotwang f2c0d77
Merge branch 'develop' into bf16_fp4_gemm
eliotwang f65c005
Merge branch 'develop' into bf16_fp4_gemm
eliotwang 78ae8aa
support bf16*mxfp4 gemm
e953070
rebase bf16*fp4 example to develop branch
k50112113 9d01db5
Clean up commented debug code in GEMM kernel
eliotwang 28d4d24
rename example folder
8229d64
rebase to new develop
k50112113 23c89b3
rebase to new develop
k50112113 ec53824
Merge branch 'develop' into bf16_fp4_gemm
eliotwang adb4bc3
Merge branch 'develop' into bf16_fp4_gemm
eliotwang 5304448
Merge branch 'develop' into bf16_fp4_gemm
illsilin 75dbf17
fix clang format
illsilin 3efca0f
Merge branch 'develop' into bf16_fp4_gemm
eliotwang 8aec6b9
Merge branch 'develop' into bf16_fp4_gemm
eliotwang 5e91e9e
Merge branch 'develop' into bf16_fp4_gemm
eliotwang cba5ab1
Merge branch 'develop' into bf16_fp4_gemm
eliotwang 5697816
Merge branch 'develop' into bf16_fp4_gemm
eliotwang 8f272b3
Merge branch 'develop' into bf16_fp4_gemm
illsilin 8bad07a
Merge branch 'develop' into bf16_fp4_gemm
eliotwang 03406c0
Merge branch 'develop' into bf16_fp4_gemm
eliotwang 984ed9f
Merge remote-tracking branch 'upstream/develop' into bf16_fp4_gemm
eliotwang 46ff36a
update code according to reviewer's comment
eliotwang 01f5c75
Update README.md
eliotwang 92d7082
update code according to reviewer's comment
eliotwang 7e32cb9
Merge branch 'bf16_fp4_gemm' of https://github.com/eliotwang/heyi_com…
eliotwang 48e6393
update code according to reviewer's comment
eliotwang 88c6a8c
Update CMakeLists.txt
eliotwang 87c4e07
Merge branch 'develop' into bf16_fp4_gemm
eliotwang 6883654
Merge remote-tracking branch 'upstream/develop' into bf16_fp4_gemm
eliotwang 3579741
Update README.md
eliotwang 9579e6f
Update CMakeLists.txt
eliotwang 8a4ac27
Delete files
eliotwang f6ffb76
Delete files
eliotwang 5225b4d
Merge branch 'develop' into bf16_fp4_gemm
eliotwang eb15154
Merge branch 'develop' into bf16_fp4_gemm
eliotwang ba12e7d
Merge branch 'develop' into bf16_fp4_gemm
eliotwang fde6e39
Add unit tests
eliotwang f54857f
Merge branch 'develop' into bf16_fp4_gemm
eliotwang 7ceedeb
Merge branch 'develop' into bf16_fp4_gemm
eliotwang d5ce464
Merge branch 'develop' into bf16_fp4_gemm
eliotwang 329c601
Update test_gemm_quant_base.hpp
eliotwang 8c75bc1
Merge branch 'develop' into bf16_fp4_gemm
eliotwang 7851abb
Merge branch 'develop' into bf16_fp4_gemm
eliotwang
44 changes: 44 additions & 0 deletions
example/ck_tile/38_block_scale_gemm/gemm_bquant_quantgrouped_prefill_bf16mxfp4.cpp
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,44 @@ | ||
| // SPDX-License-Identifier: MIT | ||
| // Copyright (c) , Advanced Micro Devices, Inc. All rights reserved. | ||
| | ||
| #include "run_gemm_quant_example.inc" | ||
| | ||
| template <typename T> | ||
| using GemmConfig = GemmConfigBQuantPrefill<T>; | ||
| | ||
| #define RUN_GEMM_EXAMPLE_PREC_TYPE \ | ||
| run_gemm_example_prec_type<GemmConfig<ck_tile::pk_fp4_raw_t>, \ | ||
| TypeConfig, \ | ||
| QuantGroupSize, \ | ||
| ck_tile::QuantType::BQuantGrouped>(arg_parser); | ||
| | ||
| void bquant_quantgrouped_bf16f4_instance_factory( | ||
| std::unordered_map<size_t, std::function<int(const ck_tile::ArgParser&)>>& lut) | ||
| { | ||
| using TypeConfig = decltype(GemmQuantTypeConfig<ck_tile::bf16_t, | ||
| ck_tile::pk_fp4_raw_t, | ||
| ck_tile::bf16_t, | ||
| ck_tile::pk_fp4_raw_t>{}); | ||
| #ifndef CK_GFX950_SUPPORT | ||
| lut[hash_multiple_strings({"bf16f4", "bquant", "non-preshuffleb", "1x1x32"})] = | ||
| [](const ck_tile::ArgParser& arg_parser) { | ||
| using QuantGroupSize = ck_tile::QuantGroupShape<ck_tile::sequence<1, 1, 32>>; | ||
| return RUN_GEMM_EXAMPLE_PREC_TYPE; | ||
| }; | ||
| #endif | ||
| lut[hash_multiple_strings({"bf16f4", "bquant", "non-preshuffleb", "1x1x32"})] = | ||
| [](const ck_tile::ArgParser& arg_parser) { | ||
| using QuantGroupSize = ck_tile::QuantGroupShape<ck_tile::sequence<1, 1, 32>>; | ||
| return RUN_GEMM_EXAMPLE_PREC_TYPE; | ||
| }; | ||
| lut[hash_multiple_strings({"bf16f4", "bquant", "non-preshuffleb", "1x1x64"})] = | ||
| [](const ck_tile::ArgParser& arg_parser) { | ||
| using QuantGroupSize = ck_tile::QuantGroupShape<ck_tile::sequence<1, 1, 64>>; | ||
| return RUN_GEMM_EXAMPLE_PREC_TYPE; | ||
| }; | ||
| lut[hash_multiple_strings({"bf16f4", "bquant", "non-preshuffleb", "1x1x128"})] = | ||
| [](const ck_tile::ArgParser& arg_parser) { | ||
| using QuantGroupSize = ck_tile::QuantGroupShape<ck_tile::sequence<1, 1, 128>>; | ||
| return RUN_GEMM_EXAMPLE_PREC_TYPE; | ||
| }; | ||
| } |
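
The registration pattern in the diff above (a hash of several strings used as a key, mapped to a lambda that pins a compile-time quant group size) can be illustrated with a standalone sketch. This is not the ck_tile code: `hash_strings` is a hypothetical stand-in for `hash_multiple_strings`, whose real signature and mixing function are not shown in this diff, and `run_instance` is an assumed placeholder for the kernel launch that takes a plain `int` in place of `ck_tile::ArgParser`.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <initializer_list>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for hash_multiple_strings: folds the hash of each
// string into one key. The mixing step is an assumption, not the repo's.
std::size_t hash_strings(std::initializer_list<std::string> parts)
{
    std::size_t seed = 0;
    for(const auto& p : parts)
        seed ^= std::hash<std::string>{}(p) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
    return seed;
}

// Runtime key -> callable, mirroring the shape of the `lut` parameter.
using Lut = std::unordered_map<std::size_t, std::function<int(int)>>;

// Placeholder for a kernel instance: the group size is a template
// parameter, so each lambda below selects a distinct instantiation.
template <int GroupK>
int run_instance(int k)
{
    return k / GroupK; // number of scale groups along K
}

void register_instances(Lut& lut)
{
    lut[hash_strings({"bf16f4", "bquant", "non-preshuffleb", "1x1x32"})] =
        [](int k) { return run_instance<32>(k); };
    lut[hash_strings({"bf16f4", "bquant", "non-preshuffleb", "1x1x64"})] =
        [](int k) { return run_instance<64>(k); };
    lut[hash_strings({"bf16f4", "bquant", "non-preshuffleb", "1x1x128"})] =
        [](int k) { return run_instance<128>(k); };
}

// Looks up the instance for a runtime key; returns -1 if none is registered.
int dispatch(const Lut& lut, std::initializer_list<std::string> key, int k)
{
    const auto it = lut.find(hash_strings(key));
    return it == lut.end() ? -1 : it->second(k);
}
```

In this sketch, assigning to the same key twice simply overwrites the earlier entry, so only the last lambda registered for a key is ever dispatched.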