forked from pytorch/pytorch
CK GEMM Backend #1480
Draft
alugorey wants to merge 58 commits into ROCm:rocm6.3_internal_testing from alugorey:ck_gemm_backend
+1,802
−221
* changes to build Centos stream 9 images * Added scripts for centos and centos stream images * Added an extra line * Add ninja installation * Optimized code * Fixes * Add comment * Optimized code * Added AMDGPU mapping for ROCm 5.2 and invalid-url for rocm_baseurl Co-authored-by: Jithun Nair <[email protected]>
- Rocblas API support is requested - SWDEV-383635 & sub task - SWDEV-390218
* Add hip_basic tensorpipe support to PyTorch
* Enabling hip_basic for Tensorpipe for PyTorch
* removing upstream tensorpipe module
* Adding ROCm specific tensorpipe submodule
* tensorpipe submodule updated
* Update the hip invalid device string
* Added ignore for tensorpipe git submodule
* Moved include of tensorpipe_cuda.h to hipify
* Updates based on review comments
* Defining the variable __HIP_PLATFORM_AMD__
* Enabling the UTs
Co-authored-by: Ronak Malik <[email protected]>
- Fortran package installation moved after gcc - Update libtinfo search code in cmake1 - Install libstdc++.so
To resolve https://ontrack-internal.amd.com/browse/SWDEV-403530 and https://ontrack-internal.amd.com/browse/SWDEV-419837. For more context check upstream issue pytorch#111834
Reversed the condition as required
- Add missing common_utils.sh - Update the install vision part - Move to amdgpu rhel 9.3 builds - Update to pick python from conda path - Add a missing package - Add ROCM_PATH and magma - Updated repo radeon path
This also fixes a problem in gesvd driver when UV is not needed.
- build_environment is hard-coded to the upstream value from when the branch was created, since the build_environment value in the dev/QA environment can vary
* Fix the parsing of /etc/os-release. The old code parses OS_DISTRO as 'PRETTY_Ubuntu' on Ubuntu and thus never links to libtinfo correctly.
* Configurable CMAKE_PREFIX_PATH in CI script.
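For illustration only, a minimal Python sketch of exact-key parsing of os-release data. It is not the CMake or shell code touched by this commit, and the helper name `parse_os_release` and the sample text are hypothetical. One plausible cause of a value like 'PRETTY_Ubuntu' (not confirmed by the commit message) is a loose match on `NAME`, which also matches the `PRETTY_NAME` line. Splitting each line on the first `=` and looking up the exact key avoids that.

```python
def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in double or single quotes.
        fields[key] = value.strip().strip('"').strip("'")
    return fields


# Sample content in the shape of /etc/os-release (made up for this sketch).
SAMPLE = 'PRETTY_NAME="Ubuntu 22.04 LTS"\nNAME="Ubuntu"\nID=ubuntu\n'

info = parse_os_release(SAMPLE)
distro = info["NAME"]  # exact key lookup, never confused with PRETTY_NAME
```

Looking up `NAME` by exact key keeps the distro name separate from `PRETTY_NAME`, whichever line appears first in the file.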
- This is done as per QA request, needs to be reverted and not required to be cherry-picked into later releases.
* Moved NAVI check to the test file * Revised NAVI check as a function
* Running triton kernel on ROCM only has one GB/s metric reported * Update test_kernel_benchmark.py
…m#1386) * Initial implementation of PyTorch ut parsing script * Extracted path variables * Use nested dict to save results * Fixes typo * Cleanup * Fixes several issues * Minor name change * Update run_pytorch_unit_tests.py * Added file banners * Supported running from API * Added more help info * Consistent naming * Format help text --------- Co-authored-by: Jithun Nair <[email protected]> Co-authored-by: Jithun Nair <[email protected]>
- PYTORCH_EXTRA_INSTALL_REQUIREMENTS is set in builder repo - Remove the PYTORCH_EXTRA_INSTALL_REQUIREMENTS step from this file
- Causing regression - SWDEV-463083
* Fix SWDEV-459623. The rank of the logsumexp tensor must be 3. This tensor was intended for internal use only but is apparently exposed to UTs.
* Fix for mGPU. The stream should be selected after picking the current device according to the input tensor.
* Add formal FP8 check in common_cuda.py * Enable inductor/test_valid_cast * Support for test_eager_fallback * allow fnuz types on amax test * Finalize passing tests vs failing * Fix fnuz constants in _to_fp8_saturated
* Enable batchnorm NHWC for MIOpen * cleanup * test to compare NHWC MIOpen batchnorm with CPU * fix 'use_miopen' condition for nhwc miopen * fix includes * use native nhwc batchnorm to verify miopen * remove extra spaces * remove empty lines * set PYTORCH_MIOPEN_SUGGEST_NHWC=1 for all test_nn.py test
…OCm#1433) * Print consolidated log file for pytorch uts * Update run_entire_tests subprocess call as well * lint * Add ERROR string
* Initial commit to port intra_node_comm to ROCm (cherry picked from commit 48d1c33) * gpt-fast running now with intra-node comm (cherry picked from commit 618c54e) --------- Co-authored-by: Prachi Gupta <[email protected]>
Co-authored-by: Jithun Nair <[email protected]>
IFU for rocm6.3_internal_testing
9ae24a7 to 12b4a67
Jenkins build for 1b6b84ecf382b55ed398c6d89714363da20a59f5 commit finished as FAILURE
Porting recent ck gemm backend changes to ROCm