[ET-VK] Introduce custom op correctness + speed testing suite & add vulkan operator testing to CI #13835
Merged
Conversation
…upgrade glslc

Pull Request resolved: #13814

## Motivation

Prepare for shaders that will use accelerated int8 dot product GLSL extensions, i.e. `dotPacked4x8AccSatEXT`.

## Changes

* Query for support for the shader integer dot product extension when creating the VkPhysicalDevice.
* Request the shader integer dot product extension when creating the VkDevice.
* Provide APIs to check whether the extension is available in the current runtime.

ghstack-source-id: 306632732
@exported-using-ghexport

Differential Revision: [D81323427](https://our.internmc.facebook.com/intern/diff/D81323427/)
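The feature query described in the Changes bullets follows the standard Vulkan `pNext`-chain pattern: chain the extension's feature struct into `VkPhysicalDeviceFeatures2`, call `vkGetPhysicalDeviceFeatures2`, and inspect the output field. The sketch below illustrates that pattern only; it is not the PR's actual code. The struct and field names mirror the real Vulkan API (`VkPhysicalDeviceShaderIntegerDotProductFeatures` is core in Vulkan 1.3, `_KHR` before that), but the enum values and the driver entry point here are mocks so the snippet is self-contained.

```cpp
#include <cassert>
#include <cstdint>

// Minimal mock declarations standing in for <vulkan/vulkan.h>, so this
// sketch compiles on its own. Real code includes the Vulkan headers; the
// struct/enum names mirror the actual API, but the enum values and the
// driver entry point below are placeholders.
using VkBool32 = std::uint32_t;
constexpr VkBool32 VK_TRUE = 1u;

enum VkStructureType : std::uint32_t {
  VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2 = 0,
  VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADER_INTEGER_DOT_PRODUCT_FEATURES = 1,
};

struct VkPhysicalDeviceShaderIntegerDotProductFeatures {
  VkStructureType sType;
  void* pNext;
  VkBool32 shaderIntegerDotProduct;
};

struct VkPhysicalDeviceFeatures2 {
  VkStructureType sType;
  void* pNext;
};

// Mock of vkGetPhysicalDeviceFeatures2 (the real function also takes a
// VkPhysicalDevice handle); here it simply reports the feature as supported.
void vkGetPhysicalDeviceFeatures2(VkPhysicalDeviceFeatures2* features) {
  auto* dot = static_cast<VkPhysicalDeviceShaderIntegerDotProductFeatures*>(
      features->pNext);
  dot->shaderIntegerDotProduct = VK_TRUE;
}

// The query pattern: chain the extension's feature struct via pNext,
// query the device, then inspect the filled-in output field.
inline bool supports_shader_integer_dot_product() {
  VkPhysicalDeviceShaderIntegerDotProductFeatures dot_features{};
  dot_features.sType =
      VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADER_INTEGER_DOT_PRODUCT_FEATURES;
  dot_features.pNext = nullptr;

  VkPhysicalDeviceFeatures2 features2{};
  features2.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2;
  features2.pNext = &dot_features;

  vkGetPhysicalDeviceFeatures2(&features2);
  return dot_features.shaderIntegerDotProduct == VK_TRUE;
}
```

The same chaining applies at device creation: the filled-in feature struct is passed on the `pNext` chain of `VkDeviceCreateInfo` to actually request the feature.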
…ulkan operator testing to CI

Pull Request resolved: #13815

## Motivation

Provide an easy way to test and benchmark custom operators when developing them.

## Changes

Introduces a custom op test suite under `backends/vulkan/test/custom_ops`. Each operator will have its own test file, as seen in the next diff. `utils.[h|cpp]` define common utilities that can be used across test files. To facilitate prototyping, prototype shaders and C++ host code can be placed under the `impl/` and `glsl` folders.

Output of the test binary looks like:

```
=== Compute Shader Performance Benchmark ===
Add Operation Prototyping Framework
----------------------------------------------------------------------
Executing 32 test cases for Add
----------------------------------------------------------------------
Add_1x64x64_Texture3D_Float     [1x64x64]      3.094 μs   1.324 GFLOP/s  PASSED
Add_1x64x64_Texture3D_Half      [1x64x64]      2.574 μs   1.591 GFLOP/s  SKIPPED
Add_1x64x64_Buffer_Float        [1x64x64]      3.084 μs   1.328 GFLOP/s  PASSED
Add_1x64x64_Buffer_Half         [1x64x64]      2.668 μs   1.535 GFLOP/s  SKIPPED
Add_1x128x128_Texture3D_Float   [1x128x128]    6.001 μs   2.730 GFLOP/s  PASSED
Add_1x128x128_Texture3D_Half    [1x128x128]    4.004 μs   4.092 GFLOP/s  SKIPPED
Add_1x128x128_Buffer_Float      [1x128x128]    6.074 μs   2.698 GFLOP/s  PASSED
Add_1x128x128_Buffer_Half       [1x128x128]    5.112 μs   3.205 GFLOP/s  SKIPPED
Add_1x256x256_Texture3D_Float   [1x256x256]   17.852 μs   3.671 GFLOP/s  PASSED
Add_1x256x256_Texture3D_Half    [1x256x256]   10.057 μs   6.517 GFLOP/s  SKIPPED
Add_1x256x256_Buffer_Float      [1x256x256]   19.027 μs   3.444 GFLOP/s  PASSED
Add_1x256x256_Buffer_Half       [1x256x256]   15.330 μs   4.275 GFLOP/s  SKIPPED
Add_1x512x512_Texture3D_Float   [1x512x512]   48.292 μs   5.428 GFLOP/s  PASSED
Add_1x512x512_Texture3D_Half    [1x512x512]   26.832 μs   9.770 GFLOP/s  SKIPPED
Add_1x512x512_Buffer_Float      [1x512x512]   48.828 μs   5.369 GFLOP/s  PASSED
Add_1x512x512_Buffer_Half       [1x512x512]   48.308 μs   5.427 GFLOP/s  SKIPPED
Add_1x1x1024_Texture3D_Float    [1x1x1024]     2.376 μs   0.431 GFLOP/s  PASSED
Add_1x1x1024_Texture3D_Half     [1x1x1024]     2.215 μs   0.462 GFLOP/s  SKIPPED
Add_1x1x1024_Buffer_Float       [1x1x1024]     2.402 μs   0.426 GFLOP/s  PASSED
Add_1x1x1024_Buffer_Half        [1x1x1024]     2.304 μs   0.445 GFLOP/s  SKIPPED
Add_1x1024x1_Texture3D_Float    [1x1024x1]     6.120 μs   0.167 GFLOP/s  PASSED
Add_1x1024x1_Texture3D_Half     [1x1024x1]     6.245 μs   0.164 GFLOP/s  SKIPPED
Add_1x1024x1_Buffer_Float       [1x1024x1]     2.392 μs   0.428 GFLOP/s  PASSED
Add_1x1024x1_Buffer_Half        [1x1024x1]     2.304 μs   0.445 GFLOP/s  SKIPPED
Add_32x32x32_Texture3D_Float    [32x32x32]    10.249 μs   3.197 GFLOP/s  PASSED
Add_32x32x32_Texture3D_Half     [32x32x32]     6.583 μs   4.978 GFLOP/s  SKIPPED
Add_32x32x32_Buffer_Float       [32x32x32]    10.468 μs   3.130 GFLOP/s  PASSED
Add_32x32x32_Buffer_Half        [32x32x32]     8.481 μs   3.864 GFLOP/s  SKIPPED
Add_16x128x64_Texture3D_Float   [16x128x64]   26.000 μs   5.041 GFLOP/s  PASSED
Add_16x128x64_Texture3D_Half    [16x128x64]   17.841 μs   7.347 GFLOP/s  SKIPPED
Add_16x128x64_Buffer_Float      [16x128x64]   28.917 μs   4.533 GFLOP/s  PASSED
Add_16x128x64_Buffer_Half       [16x128x64]   28.792 μs   4.552 GFLOP/s  SKIPPED
```

`SKIPPED` means that correctness checking is not performed on that test case. This usually happens in one of the following cases:

* The input/output dtype is fp16. The reference calculation functions have no fp16 dtype support.
* The input sizes are too big. Since the reference calculation functions are implemented in a naive manner, calculating reference data may take too long for large inputs. Larger test cases are usually meant to test performance, not correctness.

ghstack-source-id: 306632731
@exported-using-ghexport

Differential Revision: [D81323426](https://our.internmc.facebook.com/intern/diff/D81323426/)
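As a sanity check on the benchmark output, the GFLOP/s column is consistent with counting one FLOP per output element for the elementwise add. A small sketch of that throughput formula (the `gflops` helper below is illustrative, not a function from the test suite):

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Throughput of an elementwise add, counting one FLOP per output element.
// Note: `gflops` is an illustrative helper, not part of the actual suite.
double gflops(std::int64_t num_elements, double time_us) {
  const double flops = static_cast<double>(num_elements);  // 1 add per element
  // FLOP / (μs * 1000) = FLOP/ns, which is numerically equal to GFLOP/s.
  return flops / (time_us * 1000.0);
}
```

Plugging in the first row of the table: 1x64x64 = 4096 elements at 3.094 μs gives roughly 1.324 GFLOP/s, matching the reported figure.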
See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/13835.
CI status as of commit 760d3a5 with merge base e2098f8: 6 new failures, 19 pending.
SS-JIA approved these changes on Aug 30, 2025.
Labels

* `CLA Signed`: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
* `module: vulkan`: Issues related to the Vulkan delegate and code under `backends/vulkan/`
This PR was created by the merge bot to help merge the original PR into the main branch.
ghstack PR number: #13815 by @SS-JIA
Please use the ghstack PR as the source of truth for the PR details, comments, and reviews.
ghstack PR base: https://github.com/pytorch/executorch/tree/gh/SS-JIA/315/base
ghstack PR head: https://github.com/pytorch/executorch/tree/gh/SS-JIA/315/head
Merge bot PR base: https://github.com/pytorch/executorch/tree/gh/SS-JIA/314/orig
Merge bot PR head: https://github.com/pytorch/executorch/tree/gh/SS-JIA/315/orig
@diff-train-skip-merge
cc @SS-JIA @manuelcandales @cbilgin