
[CUDA] fix setting of CUDA architectures and enable support for NVIDIA Blackwell #6812

Merged
merged 18 commits into master on Feb 2, 2025

Conversation

StrikerRUS
Collaborator

@@ -224,19 +224,23 @@ if(USE_CUDA)
# reference for mapping of CUDA toolkit component versions to supported architectures ("compute capabilities"):
# https://en.wikipedia.org/wiki/CUDA#GPUs_supported
set(CUDA_ARCHS "60" "61" "62" "70" "75")
if(CUDA_VERSION VERSION_GREATER_EQUAL "110")
if(CUDAToolkit_VERSION VERSION_GREATER_EQUAL "11.0")
Collaborator Author

As we call FindCUDAToolkit but not FindCUDA, the CUDA_VERSION variable is left undefined, so the old comparison could never succeed.

find_package(CUDAToolkit 11.0 REQUIRED)
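For context, a minimal sketch (a hypothetical standalone CMakeLists.txt fragment, not the project's actual build file) of the variable-naming difference between the two modules:

```cmake
# The FindCUDAToolkit module (CMake >= 3.17) defines CUDAToolkit_VERSION;
# only the legacy FindCUDA module defined CUDA_VERSION. A project that calls
# find_package(CUDAToolkit) therefore never has CUDA_VERSION set.
cmake_minimum_required(VERSION 3.18)
project(demo LANGUAGES CXX)
find_package(CUDAToolkit 11.0 REQUIRED)
message(STATUS "Toolkit version: ${CUDAToolkit_VERSION}")
message(STATUS "Legacy variable: '${CUDA_VERSION}'")  # expands to the empty string
```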

Collaborator

Excellent fix, thank you!

@@ -224,19 +224,23 @@ if(USE_CUDA)
# reference for mapping of CUDA toolkit component versions to supported architectures ("compute capabilities"):
# https://en.wikipedia.org/wiki/CUDA#GPUs_supported
set(CUDA_ARCHS "60" "61" "62" "70" "75")
if(CUDA_VERSION VERSION_GREATER_EQUAL "110")
Collaborator Author

@StrikerRUS StrikerRUS Feb 2, 2025

"110" means exactly 110 version during comparison, VERSION_GREATER_EQUAL doesn't know whether and where we want to put a .: 11.0 or maybe 1.10.

set_target_properties(
lightgbm_objs
PROPERTIES
CUDA_ARCHITECTURES ${CUDA_ARCHS}
CUDA_ARCHITECTURES "${CUDA_ARCHS}"
Collaborator Author

Prevent the following error:

-- Using _mm_malloc
CMake Error at CMakeLists.txt:572 (set_target_properties):
  set_target_properties called with incorrect number of arguments.


CMake Error at CMakeLists.txt:579 (set_target_properties):
  set_target_properties called with incorrect number of arguments.

CUDA_ARCHS was passed in the following form: 60616270758086878990100120+PTX.
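A sketch of the quoting issue, with illustrative architecture values (the real list is longer):

```cmake
set(CUDA_ARCHS "60" "120")

# Unquoted, the list expands into one argument per element, so the call below
# would pass CUDA_ARCHITECTURES, "60", "120": an odd argument count after
# PROPERTIES, which CMake rejects with "incorrect number of arguments".
#   set_target_properties(lightgbm_objs PROPERTIES CUDA_ARCHITECTURES ${CUDA_ARCHS})

# Quoted, the whole list is a single argument, the semicolon-joined "60;120".
#   set_target_properties(lightgbm_objs PROPERTIES CUDA_ARCHITECTURES "${CUDA_ARCHS}")
```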

Comment on lines +246 to +247
list(TRANSFORM CUDA_ARCHS APPEND "-real")
list(APPEND CUDA_ARCHS "${CUDA_LAST_SUPPORTED_ARCH}-real" "${CUDA_LAST_SUPPORTED_ARCH}-virtual")
Collaborator Author

Fix the following error:

[33/70] Building CUDA object CMakeFiles/lightgbm_objs.dir/src/boosting/cuda/cuda_score_updater.cu.o
FAILED: CMakeFiles/lightgbm_objs.dir/src/boosting/cuda/cuda_score_updater.cu.o 
/usr/local/cuda/bin/nvcc -forward-unknown-to-host-compiler -ccbin=/usr/bin/g++ -DEIGEN_DONT_PARALLELIZE -DEIGEN_MPL2_ONLY -DMM_MALLOC -DMM_PREFETCH -DUSE_CUDA -DUSE_SOCKET -I/__w/LightGBM/LightGBM/lightgbm-python/external_libs/eigen -I/__w/LightGBM/LightGBM/lightgbm-python/external_libs/fast_double_parser/include -I/__w/LightGBM/LightGBM/lightgbm-python/external_libs/fmt/include -I/usr/local/cuda/targets/x86_64-linux/include -I/__w/LightGBM/LightGBM/lightgbm-python/include -Xcompiler=-fopenmp -Xcompiler=-fPIC -Xcompiler=-Wall -O3 -lineinfo -O3 -DNDEBUG -std=c++11 "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_62,code=[compute_62,sm_62]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_100,code=[compute_100,sm_100]" "--generate-code=arch=compute_120+PTX,code=[compute_120+PTX,sm_120+PTX]" -MD -MT CMakeFiles/lightgbm_objs.dir/src/boosting/cuda/cuda_score_updater.cu.o -MF CMakeFiles/lightgbm_objs.dir/src/boosting/cuda/cuda_score_updater.cu.o.d -x cu -rdc=true -c /__w/LightGBM/LightGBM/lightgbm-python/src/boosting/cuda/cuda_score_updater.cu -o CMakeFiles/lightgbm_objs.dir/src/boosting/cuda/cuda_score_updater.cu.o
nvcc fatal   : Unsupported gpu architecture 'compute_120+PTX'

Borrowed from XGBoost:
https://github.com/dmlc/xgboost/blob/a46585a36c4bf30bfd58a2653fe8ae40beea25ce/cmake/Utils.cmake#L73-L74

https://github.com/dmlc/xgboost/actions/runs/13056819151/job/36429848864#step:6:162
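A sketch of the scheme borrowed from XGBoost, with illustrative architecture values: real (SASS) code is requested for every supported architecture, plus virtual (PTX) code for the newest one so that future GPUs can JIT-compile. CMake translates the `-real`/`-virtual` suffixes into valid `--generate-code` flags, whereas appending a literal "+PTX" to an architecture number does not survive that translation, which produced the invalid `compute_120+PTX` above.

```cmake
set(CUDA_ARCHS "60" "70" "90")
set(CUDA_LAST_SUPPORTED_ARCH "120")

# Compile SASS for each listed architecture, and both SASS and PTX for the
# last supported one.
list(TRANSFORM CUDA_ARCHS APPEND "-real")
list(APPEND CUDA_ARCHS "${CUDA_LAST_SUPPORTED_ARCH}-real" "${CUDA_LAST_SUPPORTED_ARCH}-virtual")

message(STATUS "${CUDA_ARCHS}")  # 60-real;70-real;90-real;120-real;120-virtual
```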

@@ -32,7 +32,7 @@ if [ "$PY_MINOR_VER" -gt 7 ]; then
--inspect \
--ignore 'compiled-objects-have-debug-symbols'\
--ignore 'distro-too-large-compressed' \
--max-allowed-size-uncompressed '100M' \
--max-allowed-size-uncompressed '120M' \
Collaborator Author

To avoid the following check failure (embedding code for the additional architectures makes the compiled library, and therefore the wheel, larger):

------------ check results -----------
1. [distro-too-large-uncompressed] Uncompressed size 0.1G is larger than the allowed size (100.0M).
errors found while checking: 1

@StrikerRUS StrikerRUS marked this pull request as ready for review February 2, 2025 15:55
Collaborator

@jameslamb jameslamb left a comment

This looks great, thank you so much!!

@jameslamb
Copy link
Collaborator

I'm really happy we were able to get Blackwell support into the next release 😁

@jameslamb jameslamb merged commit c9de57b into master Feb 2, 2025
49 checks passed
@jameslamb jameslamb deleted the ci/cuda branch February 2, 2025 18:56