
bitsandbytes/libbitsandbytes_cpu.so: undefined symbol: cget_managed_ptr #6851

Open
missTL opened this issue Feb 7, 2025 · 0 comments
Labels
bug (Something isn't working), pending (This problem is yet to be addressed)

Comments

missTL commented Feb 7, 2025

Reminder

  • I have read the above rules and searched the existing issues.

System Info

[2025-01-14 10:19:36,904] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[WARNING] async_io requires the dev libaio .so object and headers but these were not found.
[WARNING] async_io: please install the libaio-devel package with yum
[WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
[WARNING] Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
[WARNING] sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.5
[WARNING] using untested triton version (3.1.0), only 1.0.0 is known to be compatible
/home/zengshuang.zs/anaconda3/envs/mllm/lib/python3.10/site-packages/deepspeed/runtime/zero/linear.py:49: FutureWarning: `torch.cuda.amp.custom_fwd(args...)` is deprecated. Please use `torch.amp.custom_fwd(args..., device_type='cuda')` instead.
  def forward(ctx, input, weight, bias=None):
/home/zengshuang.zs/anaconda3/envs/mllm/lib/python3.10/site-packages/deepspeed/runtime/zero/linear.py:67: FutureWarning: `torch.cuda.amp.custom_bwd(args...)` is deprecated. Please use `torch.amp.custom_bwd(args..., device_type='cuda')` instead.
  def backward(ctx, grad_output):
Could not find the bitsandbytes CUDA binary at PosixPath('/home/zengshuang.zs/anaconda3/envs/mllm/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda124.so')
The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
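The log above shows the CUDA binary (`libbitsandbytes_cuda124.so`) was not found, so bitsandbytes fell back to the CPU-only library. A quick way to confirm which native binaries are actually installed in the active environment (a sketch; the filenames come from the log, and the `cuda124` suffix is an assumption matching the reported PyTorch 2.5.1+cu124 build):

```python
# Check which bitsandbytes native binaries exist in the active env's
# site-packages. Filenames are taken from the log above; "cuda124" is
# an assumption matching the PyTorch cu124 build reported here.
import sysconfig
from pathlib import Path

pkg = Path(sysconfig.get_paths()["purelib"]) / "bitsandbytes"
status = {
    name: (pkg / name).exists()
    for name in ("libbitsandbytes_cuda124.so", "libbitsandbytes_cpu.so")
}
for name, present in status.items():
    print(name, "found" if present else "MISSING")
```

If only the CPU library is present, reinstalling bitsandbytes from a CUDA-enabled wheel (e.g. `pip install -U --force-reinstall bitsandbytes`) is the usual fix.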

llamafactory version: 0.9.2.dev0
Platform: Linux-3.10.0-1160.119.1.el7.x86_64-x86_64-with-glibc2.17
Python version: 3.10.16
PyTorch version: 2.5.1+cu124 (GPU)
Transformers version: 4.46.1
Datasets version: 3.1.0
Accelerate version: 1.0.1
PEFT version: 0.12.0
TRL version: 0.9.6
GPU type: NVIDIA RTX A6000
DeepSpeed version: 0.14.4
Bitsandbytes version: 0.45.0
vLLM version: 0.6.4.post1

Reproduction

To avoid OOM, I set `optim: paged_adamw_8bit`, which fails with: AttributeError: /home/zengshuang.zs/anaconda3/envs/mllm/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so: undefined symbol: cget_managed_ptr.
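The AttributeError is the standard ctypes failure mode: with the CUDA binary missing, bitsandbytes loads the CPU-only `libbitsandbytes_cpu.so`, and looking up a CUDA-only export such as `cget_managed_ptr` (presumably needed by the paged optimizer) on that handle fails. A minimal sketch of the mechanism, using the running process's own libc as a stand-in for the CPU library:

```python
import ctypes

# CDLL(None) returns a handle to the running process on Linux (libc
# symbols included), standing in for libbitsandbytes_cpu.so here.
lib = ctypes.CDLL(None)

print(callable(lib.printf))  # an exported symbol resolves to a function

try:
    lib.cget_managed_ptr  # not exported by this library
except AttributeError as exc:
    # This is the same "undefined symbol" AttributeError as in the report.
    print("lookup failed:", type(exc).__name__)
```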

train_config:

### model
model_name_or_path: /home/zengshuang.zs/.cache/modelscope/hub/qwen/Qwen2-VL-7B-Instruct
image_resolution: 524288
video_resolution: 16384
trust_remote_code: true

enable_liger_kernel: true
use_unsloth_gc: true
flash_attn: fa2

### method
stage: sft
do_train: true
finetuning_type: full
freeze_vision_tower: true  # choices: [true, false]
freeze_multi_modal_projector: true  # choices: [true, false]
train_mm_proj_only: false  # choices: [true, false]
deepspeed: examples/deepspeed/ds_z3_config.json  # choices: [ds_z0_config.json, ds_z2_config.json, ds_z3_config.json]

### dataset
dataset: train_motion
template: qwen2_vl
cutoff_len: 5000
max_samples: 100000
overwrite_cache: true
preprocessing_num_workers: 16

### output
output_dir: /home/zengshuang.zs/output/llm/v2.8
logging_steps: 10
save_steps: 1000
plot_loss: true
overwrite_output_dir: true

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 2
learning_rate: 1.0e-4
num_train_epochs: 8.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000
optim: paged_adamw_8bit


### eval
val_size: 0.0001
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 500

Others

No response

@missTL missTL added bug Something isn't working pending This problem is yet to be addressed labels Feb 7, 2025