
The weights after sft of video data cannot be inferred #33

Open
orzgugu opened this issue Dec 11, 2024 · 3 comments

@orzgugu

orzgugu commented Dec 11, 2024

I used the LongVU/scripts/train_video_qwen.sh script to perform SFT on video data. When I load the resulting checkpoint with the inference code you provided, an error occurs. The error message is as follows:

```
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Loading checkpoint shards:   0%|          | 0/6 [00:05<?, ?it/s]
Traceback (most recent call last):
  File "/.../LongVU/scripts/inference_video.py", line 20, in <module>
    tokenizer, model, image_processor, context_len = load_pretrained_model(
  File "/.../LongVU/longvu/builder.py", line 159, in load_pretrained_model
    model = CambrianQwenForCausalLM.from_pretrained(
  File "/.../conda_env/longuv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3838, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/.../conda_env/longuv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4298, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/.../conda_env/longuv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 895, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/.../conda_env/longuv/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 373, in set_module_tensor_to_device
    raise ValueError(
ValueError: Trying to set a tensor of shape torch.Size([544997376]) in "weight" (which has shape torch.Size([152064, 3584])), this looks incorrect.
```

I am not sure whether this is because the provided inference code is incompatible with the checkpoint produced by SFT. This is urgent for me, and I would greatly appreciate your help!

@Amshaker

Same problem.

@orzgugu
Author

orzgugu commented Dec 17, 2024

> Same problem.

To fix this:

1. Delete the files with the `.safetensors` suffix.
2. Delete the `model.safetensors.index.json` file.
3. Rename `pytorch_model_fsdp.bin` to `pytorch_model.bin`.
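The three steps above can be sketched as a small shell snippet. Note this is a sketch, not from the LongVU repo: `mock_checkpoint` is a placeholder for the real SFT output directory, and the `mkdir`/`touch` lines only create a mock checkpoint layout so the snippet runs standalone; drop them when applying this to an actual checkpoint.

```shell
checkpoint_dir="./mock_checkpoint"

# Mock checkpoint layout for demonstration only (remove for real use).
mkdir -p "$checkpoint_dir"
touch "$checkpoint_dir/model-00001-of-00006.safetensors" \
      "$checkpoint_dir/model.safetensors.index.json" \
      "$checkpoint_dir/pytorch_model_fsdp.bin"

# 1. Delete the sharded .safetensors weight files.
rm "$checkpoint_dir"/*.safetensors
# 2. Delete the safetensors shard index, so from_pretrained stops
#    looking for the (now deleted) safetensors shards.
rm "$checkpoint_dir/model.safetensors.index.json"
# 3. Rename the FSDP checkpoint to the filename from_pretrained expects.
mv "$checkpoint_dir/pytorch_model_fsdp.bin" "$checkpoint_dir/pytorch_model.bin"

ls "$checkpoint_dir"   # prints: pytorch_model.bin
```

After this, only `pytorch_model.bin` remains, so `from_pretrained` loads the FSDP-saved weights instead of the mismatched safetensors shards.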

@HenryHZY

HenryHZY commented Jan 4, 2025

> Same problem.
>
> Delete the safetensors suffix file, delete the model.safetensors.index.json file, and rename pytorch_model_fsdp.bin to pytorch_model.bin

This works for me.
