[CUDA] fix setting of CUDA architectures and enable support for NVIDIA Blackwell #6812

Merged
merged 18 commits into from
Feb 2, 2025
2 changes: 1 addition & 1 deletion .github/workflows/cuda.yml
@@ -86,7 +86,7 @@ jobs:
- method: wheel
compiler: gcc
python_version: "3.11"
-cuda_version: "12.6.1"
+cuda_version: "12.8.0"
linux_version: "ubuntu22.04"
task: cuda
- method: source
23 changes: 15 additions & 8 deletions CMakeLists.txt
@@ -224,21 +224,28 @@ if(USE_CUDA)
# reference for mapping of CUDA toolkit component versions to supported architectures ("compute capabilities"):
# https://en.wikipedia.org/wiki/CUDA#GPUs_supported
set(CUDA_ARCHS "60" "61" "62" "70" "75")
-if(CUDA_VERSION VERSION_GREATER_EQUAL "110")
StrikerRUS (Collaborator, Author) commented on Feb 2, 2025:

"110" is compared as the single-component version 110. VERSION_GREATER_EQUAL cannot know whether or where a "." was intended: 11.0, or maybe 1.10.

+if(CUDAToolkit_VERSION VERSION_GREATER_EQUAL "11.0")
Comment (Collaborator, Author):

Because we call FindCUDAToolkit rather than FindCUDA, the CUDA_VERSION variable is never defined. FindCUDAToolkit sets CUDAToolkit_VERSION instead. The relevant call:

find_package(CUDAToolkit 11.0 REQUIRED)

Reply (Collaborator):

Excellent fix, thank you!
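
The comparison pitfall discussed in this thread can be reproduced outside the build. The following is a minimal sketch, not part of the PR: the file name and the `toolkit_version` variable are hypothetical, and the value "12.8" is only an illustrative stand-in for whatever FindCUDAToolkit would report.

```cmake
# Run in script mode with: cmake -P version_compare.cmake  (hypothetical file name)
cmake_minimum_required(VERSION 3.10)  # VERSION_GREATER_EQUAL itself needs CMake 3.7+

# Hypothetical toolkit version, used only for illustration
set(toolkit_version "12.8")

# "110" has a single component, so this asks "is 12.8 >= 110?"
if(toolkit_version VERSION_GREATER_EQUAL "110")
  message(STATUS "compared against \"110\": branch taken")
else()
  message(STATUS "compared against \"110\": branch skipped")  # this one runs
endif()

# "11.0" has two components, so this asks "is 12.8 >= 11.0?"
if(toolkit_version VERSION_GREATER_EQUAL "11.0")
  message(STATUS "compared against \"11.0\": branch taken")  # this one runs
else()
  message(STATUS "compared against \"11.0\": branch skipped")
endif()
```

Version strings are compared component by component, so the first components 12 and 110 decide the first check, while 12 and 11 decide the second.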

list(APPEND CUDA_ARCHS "80")
endif()
-if(CUDA_VERSION VERSION_GREATER_EQUAL "111")
+if(CUDAToolkit_VERSION VERSION_GREATER_EQUAL "11.1")
list(APPEND CUDA_ARCHS "86")
endif()
-if(CUDA_VERSION VERSION_GREATER_EQUAL "115")
+if(CUDAToolkit_VERSION VERSION_GREATER_EQUAL "11.5")
list(APPEND CUDA_ARCHS "87")
endif()
-if(CUDA_VERSION VERSION_GREATER_EQUAL "118")
+if(CUDAToolkit_VERSION VERSION_GREATER_EQUAL "11.8")
list(APPEND CUDA_ARCHS "89")
list(APPEND CUDA_ARCHS "90")
endif()
+if(CUDAToolkit_VERSION VERSION_GREATER_EQUAL "12.8")
+list(APPEND CUDA_ARCHS "100")
+list(APPEND CUDA_ARCHS "120")
+endif()
# Generate PTX for the most recent architecture for forwards compatibility
list(POP_BACK CUDA_ARCHS CUDA_LAST_SUPPORTED_ARCH)
-list(APPEND CUDA_ARCHS "${CUDA_LAST_SUPPORTED_ARCH}+PTX")
+list(TRANSFORM CUDA_ARCHS APPEND "-real")
+list(APPEND CUDA_ARCHS "${CUDA_LAST_SUPPORTED_ARCH}-real" "${CUDA_LAST_SUPPORTED_ARCH}-virtual")
Comment on lines +246 to +247 (Collaborator, Author):

This fixes the following build error, where the `+PTX` suffix reached nvcc as part of the architecture name:

[33/70] Building CUDA object CMakeFiles/lightgbm_objs.dir/src/boosting/cuda/cuda_score_updater.cu.o
FAILED: CMakeFiles/lightgbm_objs.dir/src/boosting/cuda/cuda_score_updater.cu.o 
/usr/local/cuda/bin/nvcc -forward-unknown-to-host-compiler -ccbin=/usr/bin/g++ -DEIGEN_DONT_PARALLELIZE -DEIGEN_MPL2_ONLY -DMM_MALLOC -DMM_PREFETCH -DUSE_CUDA -DUSE_SOCKET -I/__w/LightGBM/LightGBM/lightgbm-python/external_libs/eigen -I/__w/LightGBM/LightGBM/lightgbm-python/external_libs/fast_double_parser/include -I/__w/LightGBM/LightGBM/lightgbm-python/external_libs/fmt/include -I/usr/local/cuda/targets/x86_64-linux/include -I/__w/LightGBM/LightGBM/lightgbm-python/include -Xcompiler=-fopenmp -Xcompiler=-fPIC -Xcompiler=-Wall -O3 -lineinfo -O3 -DNDEBUG -std=c++11 "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_62,code=[compute_62,sm_62]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_100,code=[compute_100,sm_100]" "--generate-code=arch=compute_120+PTX,code=[compute_120+PTX,sm_120+PTX]" -MD -MT CMakeFiles/lightgbm_objs.dir/src/boosting/cuda/cuda_score_updater.cu.o -MF CMakeFiles/lightgbm_objs.dir/src/boosting/cuda/cuda_score_updater.cu.o.d -x cu -rdc=true -c /__w/LightGBM/LightGBM/lightgbm-python/src/boosting/cuda/cuda_score_updater.cu -o CMakeFiles/lightgbm_objs.dir/src/boosting/cuda/cuda_score_updater.cu.o
nvcc fatal   : Unsupported gpu architecture 'compute_120+PTX'

Borrowed from XGBoost:
https://github.com/dmlc/xgboost/blob/a46585a36c4bf30bfd58a2653fe8ae40beea25ce/cmake/Utils.cmake#L73-L74

https://github.com/dmlc/xgboost/actions/runs/13056819151/job/36429848864#step:6:162
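
To see what the two new lines produce, here is a small sketch that can be run in script mode. It uses a shortened, illustrative architecture list rather than the full one from CMakeLists.txt. The variable names match the diff.

```cmake
# Run in script mode with: cmake -P cuda_archs.cmake  (hypothetical file name)
cmake_minimum_required(VERSION 3.15)  # list(POP_BACK) needs CMake 3.15+

# Shortened list for illustration only
set(CUDA_ARCHS "60" "75" "120")

# Take the newest architecture off the end of the list
list(POP_BACK CUDA_ARCHS CUDA_LAST_SUPPORTED_ARCH)

# Every older architecture gets binary code only
list(TRANSFORM CUDA_ARCHS APPEND "-real")

# The newest architecture gets binary code plus PTX
list(APPEND CUDA_ARCHS "${CUDA_LAST_SUPPORTED_ARCH}-real" "${CUDA_LAST_SUPPORTED_ARCH}-virtual")

message(STATUS "CUDA_ARCHITECTURES: ${CUDA_ARCHS}")
# prints: -- CUDA_ARCHITECTURES: 60-real;75-real;120-real;120-virtual
```

In the CUDA_ARCHITECTURES property, a `-real` entry requests device code for that GPU generation and a `-virtual` entry requests PTX, so only the last architecture carries the forward-compatible PTX.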

message(STATUS "CUDA_ARCHITECTURES: ${CUDA_ARCHS}")
if(USE_DEBUG)
set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -g")
else()
@@ -567,22 +574,22 @@ if(USE_CUDA)
set_target_properties(
lightgbm_objs
PROPERTIES
-CUDA_ARCHITECTURES ${CUDA_ARCHS}
+CUDA_ARCHITECTURES "${CUDA_ARCHS}"
Comment (Collaborator, Author):

This prevents the following error:

-- Using _mm_malloc
CMake Error at CMakeLists.txt:572 (set_target_properties):
  set_target_properties called with incorrect number of arguments.


CMake Error at CMakeLists.txt:579 (set_target_properties):
  set_target_properties called with incorrect number of arguments.

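CUDA_ARCHS was passed in the following form: 60616270758086878990100120+PTX. Without the quotes, the list is expanded into separate arguments, so the property/value pairs no longer line up.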
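
The effect of the quotes can be shown with a small script-mode sketch. The `count_args` helper is hypothetical and exists only to report how many arguments a command receives. The list contents are illustrative.

```cmake
# Run in script mode with: cmake -P quoting.cmake  (hypothetical file name)
cmake_minimum_required(VERSION 3.10)

# Hypothetical helper: reports the number of arguments it was called with
function(count_args)
  message(STATUS "received ${ARGC} argument(s)")
endfunction()

# Illustrative list with three elements
set(CUDA_ARCHS "60-real" "75-real" "120-virtual")

# Unquoted: the list is split, giving the property name plus three values
count_args(CUDA_ARCHITECTURES ${CUDA_ARCHS})    # 4 arguments

# Quoted: the list stays one semicolon-separated value
count_args(CUDA_ARCHITECTURES "${CUDA_ARCHS}")  # 2 arguments
```

Because set_target_properties expects property/value pairs after PROPERTIES, passing the list as a single quoted value keeps each property paired with exactly one value.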

CUDA_SEPARABLE_COMPILATION ON
)

set_target_properties(
_lightgbm
PROPERTIES
-CUDA_ARCHITECTURES ${CUDA_ARCHS}
+CUDA_ARCHITECTURES "${CUDA_ARCHS}"
CUDA_RESOLVE_DEVICE_SYMBOLS ON
)

if(BUILD_CLI)
set_target_properties(
lightgbm
PROPERTIES
-CUDA_ARCHITECTURES ${CUDA_ARCHS}
+CUDA_ARCHITECTURES "${CUDA_ARCHS}"
CUDA_RESOLVE_DEVICE_SYMBOLS ON
)
endif()