[CUDA] fix setting of CUDA architectures and enable support for NVIDIA Blackwell #6812
@@ -224,19 +224,23 @@ if(USE_CUDA)
# reference for mapping of CUDA toolkit component versions to supported architectures ("compute capabilities"):
# https://en.wikipedia.org/wiki/CUDA#GPUs_supported
set(CUDA_ARCHS "60" "61" "62" "70" "75")
if(CUDA_VERSION VERSION_GREATER_EQUAL "110")
if(CUDAToolkit_VERSION VERSION_GREATER_EQUAL "11.0")
Because we call FindCUDAToolkit rather than FindCUDA, the CUDA_VERSION variable is undefined.
Line 220 in 425395d
find_package(CUDAToolkit 11.0 REQUIRED)
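A minimal sketch of how this can be checked, written as a CMakeLists.txt fragment. It assumes a CUDA toolkit of version 11.0 or newer is installed; the messages are illustrative and not part of the PR.

```cmake
# Fragment for a CMakeLists.txt; assumes a CUDA toolkit >= 11.0 is installed.
find_package(CUDAToolkit 11.0 REQUIRED)

# FindCUDAToolkit reports the toolkit version in CUDAToolkit_VERSION.
message(STATUS "CUDAToolkit_VERSION = ${CUDAToolkit_VERSION}")

# CUDA_VERSION belongs to the older FindCUDA module, so it stays unset here
# unless something else in the project defines it.
if(NOT DEFINED CUDA_VERSION)
  message(STATUS "CUDA_VERSION is not defined")
endif()
```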
Excellent fix, thank you!
@@ -224,19 +224,23 @@ if(USE_CUDA)
# reference for mapping of CUDA toolkit component versions to supported architectures ("compute capabilities"):
# https://en.wikipedia.org/wiki/CUDA#GPUs_supported
set(CUDA_ARCHS "60" "61" "62" "70" "75")
if(CUDA_VERSION VERSION_GREATER_EQUAL "110")
"110" is compared as exactly version 110; VERSION_GREATER_EQUAL cannot know whether or where we meant to put a dot: 11.0 or maybe 1.10.
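The comparison behaviour can be reproduced with a small script run via `cmake -P`. This is a sketch only: `TOOLKIT_VERSION` is a hypothetical stand-in for `CUDAToolkit_VERSION`, and "12.8" is an arbitrary example value.

```cmake
# Run with: cmake -P version_compare.cmake
# TOOLKIT_VERSION is a hypothetical stand-in for CUDAToolkit_VERSION.
set(TOOLKIT_VERSION "12.8")

# "110" has a single component, so it is read as major version 110.
if(TOOLKIT_VERSION VERSION_GREATER_EQUAL "110")
  message(STATUS "${TOOLKIT_VERSION} >= 110")
else()
  message(STATUS "${TOOLKIT_VERSION} < 110, so the branch is skipped")
endif()

# "11.0" is compared component by component: major 11, minor 0.
if(TOOLKIT_VERSION VERSION_GREATER_EQUAL "11.0")
  message(STATUS "${TOOLKIT_VERSION} >= 11.0")
endif()
```

With the literal "110", the check would only pass for a toolkit whose major version is at least 110, which is why the dotted form is needed.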
set_target_properties(
lightgbm_objs
PROPERTIES
CUDA_ARCHITECTURES ${CUDA_ARCHS}
CUDA_ARCHITECTURES "${CUDA_ARCHS}"
Prevent the following error:
-- Using _mm_malloc
CMake Error at CMakeLists.txt:572 (set_target_properties):
set_target_properties called with incorrect number of arguments.
CMake Error at CMakeLists.txt:579 (set_target_properties):
set_target_properties called with incorrect number of arguments.
CUDA_ARCHS were passed in the following form: 60616270758086878990100120+PTX.
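The effect of the quotes can be seen in a short script run via `cmake -P`. This is a sketch: `count_args` is a hypothetical helper, and the three-element list is shortened for illustration.

```cmake
# Run with: cmake -P list_quoting.cmake
set(ARCHS "60" "61" "70")  # shortened, illustrative list

# count_args is a hypothetical helper that reports how many arguments it got.
function(count_args)
  message(STATUS "received ${ARGC} argument(s)")
endfunction()

count_args(${ARCHS})    # unquoted: the list expands into 3 separate arguments
count_args("${ARCHS}")  # quoted: a single argument, "60;61;70"

# message() concatenates its arguments, so an unquoted list prints as 606170.
message(STATUS ${ARCHS})
```

`set_target_properties` expects property/value pairs after `PROPERTIES`, so an unquoted list that expands to an odd number of tokens triggers the "incorrect number of arguments" error. The concatenated form quoted in the comment is presumably what an unquoted `message()` call prints.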
list(TRANSFORM CUDA_ARCHS APPEND "-real")
list(APPEND CUDA_ARCHS "${CUDA_LAST_SUPPORTED_ARCH}-real" "${CUDA_LAST_SUPPORTED_ARCH}-virtual")
Fix the following error:
[33/70] Building CUDA object CMakeFiles/lightgbm_objs.dir/src/boosting/cuda/cuda_score_updater.cu.o
FAILED: CMakeFiles/lightgbm_objs.dir/src/boosting/cuda/cuda_score_updater.cu.o
/usr/local/cuda/bin/nvcc -forward-unknown-to-host-compiler -ccbin=/usr/bin/g++ -DEIGEN_DONT_PARALLELIZE -DEIGEN_MPL2_ONLY -DMM_MALLOC -DMM_PREFETCH -DUSE_CUDA -DUSE_SOCKET -I/__w/LightGBM/LightGBM/lightgbm-python/external_libs/eigen -I/__w/LightGBM/LightGBM/lightgbm-python/external_libs/fast_double_parser/include -I/__w/LightGBM/LightGBM/lightgbm-python/external_libs/fmt/include -I/usr/local/cuda/targets/x86_64-linux/include -I/__w/LightGBM/LightGBM/lightgbm-python/include -Xcompiler=-fopenmp -Xcompiler=-fPIC -Xcompiler=-Wall -O3 -lineinfo -O3 -DNDEBUG -std=c++11 "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_61,code=[compute_61,sm_61]" "--generate-code=arch=compute_62,code=[compute_62,sm_62]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_87,code=[compute_87,sm_87]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" "--generate-code=arch=compute_100,code=[compute_100,sm_100]" "--generate-code=arch=compute_120+PTX,code=[compute_120+PTX,sm_120+PTX]" -MD -MT CMakeFiles/lightgbm_objs.dir/src/boosting/cuda/cuda_score_updater.cu.o -MF CMakeFiles/lightgbm_objs.dir/src/boosting/cuda/cuda_score_updater.cu.o.d -x cu -rdc=true -c /__w/LightGBM/LightGBM/lightgbm-python/src/boosting/cuda/cuda_score_updater.cu -o CMakeFiles/lightgbm_objs.dir/src/boosting/cuda/cuda_score_updater.cu.o
nvcc fatal : Unsupported gpu architecture 'compute_120+PTX'
Borrowed from XGBoost:
https://github.com/dmlc/xgboost/blob/a46585a36c4bf30bfd58a2653fe8ae40beea25ce/cmake/Utils.cmake#L73-L74
https://github.com/dmlc/xgboost/actions/runs/13056819151/job/36429848864#step:6:162
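The resulting list can be inspected with a script run via `cmake -P`. This is a sketch under assumptions: the shortened architecture list is illustrative, and the `list(POP_BACK ...)` step that fills `CUDA_LAST_SUPPORTED_ARCH` is assumed here, since it is not shown in the hunk above.

```cmake
# Run with: cmake -P cuda_archs.cmake
cmake_minimum_required(VERSION 3.15)  # list(POP_BACK) needs CMake 3.15+

set(CUDA_ARCHS "60" "61" "70" "120")  # shortened, illustrative list

# Assumption: the newest architecture is taken off the end of the list.
list(POP_BACK CUDA_ARCHS CUDA_LAST_SUPPORTED_ARCH)

# Build machine code (SASS) only for every remaining architecture.
list(TRANSFORM CUDA_ARCHS APPEND "-real")

# For the newest architecture, build machine code and also embed PTX
# so that newer GPUs can JIT-compile it.
list(APPEND CUDA_ARCHS "${CUDA_LAST_SUPPORTED_ARCH}-real" "${CUDA_LAST_SUPPORTED_ARCH}-virtual")

message(STATUS "CUDA_ARCHS = ${CUDA_ARCHS}")
# prints: -- CUDA_ARCHS = 60-real;61-real;70-real;120-real;120-virtual
```

The `-real` and `-virtual` suffixes are the forms the `CUDA_ARCHITECTURES` property understands, whereas a `+PTX` suffix is passed straight through to nvcc and rejected, as in the log above.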
@@ -32,7 +32,7 @@ if [ "$PY_MINOR_VER" -gt 7 ]; then
--inspect \
--ignore 'compiled-objects-have-debug-symbols'\
--ignore 'distro-too-large-compressed' \
--max-allowed-size-uncompressed '100M' \
--max-allowed-size-uncompressed '120M' \
To avoid the following error:
------------ check results -----------
1. [distro-too-large-uncompressed] Uncompressed size 0.1G is larger than the allowed size (100.0M).
errors found while checking: 1
This looks great, thank you so much!!
I'm really happy we were able to get Blackwell support into the next release 😁
Refer to dmlc/xgboost#11187 and https://en.wikipedia.org/wiki/CUDA#:~:text=GB10%20(%3F)-,12.0,-GB202%2C%20GB203%2C%20GB205