@MilkClouds MilkClouds commented Sep 9, 2025

Summary

This PR adds InternVL3 models (e.g. https://huggingface.co/OpenGVLab/InternVL3-1B-hf) to liger-kernel. Models from InternVL2.5-MPO (e.g. https://huggingface.co/OpenGVLab/InternVL2_5-8B-MPO-hf) through InternVL3.5 share the same model type, internvl, but I focused on supporting InternVL3. Other versions may work, but I have not verified them.
This model type is available in transformers from v4.52.1 onward: https://github.com/huggingface/transformers/releases/tag/v4.52.1
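The version requirement above can be checked before attempting to use the model. The following is a minimal sketch, not code from this PR: `parse_version` and `has_internvl_support` are hypothetical helper names, and the parser deliberately ignores pre-release suffixes such as `.dev0`.

```python
import re

# Minimum release named in the PR description.
MIN_TRANSFORMERS = (4, 52, 1)


def parse_version(text):
    """Return the leading numeric components of a version string as a 3-tuple of ints.

    A missing patch component is treated as 0; any suffix such as ".dev0" is ignored.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def has_internvl_support(transformers_version):
    """Hypothetical helper: True when the version is at least MIN_TRANSFORMERS."""
    return parse_version(transformers_version) >= MIN_TRANSFORMERS
```

In practice the argument would be `transformers.__version__`; a caller could skip InternVL-specific tests or patches when the helper returns False.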

Testing Done

  • Hardware Type: H100
  • run make test to ensure correctness: all passed
============ 2072 passed, 215 skipped, 44 warnings in 918.28s (0:15:18) ============
  • run make checkstyle to ensure code style
  • run make test-convergence to ensure convergence: the two GLM tests (mini_glm4v and mini_glm4v_moe) fail on the loss comparison; cause not yet identified
======================== short test summary info ========================
FAILED test/convergence/bf16/test_mini_models.py::test_mini_model[mini_glm4v-32-1e-05-dtype14-0.01-0.01-0.1-0.01-0.01-0.01] - AssertionError: [Loss]Number of mismatched elements: 3
FAILED test/convergence/bf16/test_mini_models.py::test_mini_model[mini_glm4v_moe-32-1e-05-dtype15-0.01-0.2-0.1-0.01-0.01-0.01] - AssertionError: [Loss]Number of mismatched elements: 2
============ 2 failed, 19 passed, 1 warning in 117.34s (0:01:57) ============
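For readers unfamiliar with the "Number of mismatched elements" message, here is a rough sketch of how such a count can arise. It assumes the convergence check uses the usual allclose-style rule, |actual - expected| <= atol + rtol * |expected|; the function name `count_mismatches` and the loss values are illustrative only and are not taken from the test suite or from this run.

```python
def count_mismatches(actual, expected, rtol, atol):
    """Count positions where |actual - expected| > atol + rtol * |expected|."""
    if len(actual) != len(expected):
        raise ValueError("sequences must have the same length")
    return sum(
        1
        for a, e in zip(actual, expected)
        if abs(a - e) > atol + rtol * abs(e)
    )


# Made-up per-step losses, not values from this PR's test run.
reference_losses = [10.0, 8.0, 6.0, 5.0]
patched_losses = [10.0, 8.05, 6.5, 5.0]

mismatched = count_mismatches(patched_losses, reference_losses, rtol=0.01, atol=1e-5)
```

Under this rule a handful of mismatches in bf16, as in the two GLM failures above, may reflect numerical noise near the tolerance boundary rather than a functional regression, though that would need to be confirmed separately.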

Important note for testing

An existing test for multimodal models was failing, so I changed part of the test implementation. I am not sure I did it correctly, so please take a look and review it.

https://github.com/linkedin/Liger-Kernel/pull/878/files#diff-76dbff2421c7cc7c9833d3bfaa5e1c55da5ad7ce96087d356aefb7d340acaf5bR954-R962

@MilkClouds
Contributor Author

@shimizust Could you take a look at this PR?

eval_batch = next(loader_iter).to(model.device)
if with_liger:
    eval_batch["skip_logits"] = False
with torch.no_grad():
Collaborator

Good catch! Thanks for fixing this!
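The snippet under review sets `skip_logits` to False on the evaluation batch. A toy sketch of why that might matter follows. It assumes that a Liger-patched forward pass may omit logits when a loss can be computed without materializing them, unless `skip_logits=False` is passed explicitly. `toy_forward` is a hypothetical stand-in written for this illustration, not Liger-Kernel code.

```python
def toy_forward(hidden, labels=None, skip_logits=None, training=True):
    """Hypothetical stand-in for a patched forward pass; not Liger-Kernel code."""
    if skip_logits is None:
        # Assumed default: skip logits only when a loss can be computed in training.
        skip_logits = training and labels is not None
    # The "model" here just doubles each hidden value.
    logits = None if skip_logits else [2.0 * h for h in hidden]
    loss = None
    if labels is not None:
        # Placeholder mean squared error so the example has a loss to return.
        loss = sum((2.0 * h - y) ** 2 for h, y in zip(hidden, labels)) / len(labels)
    return {"loss": loss, "logits": logits}


batch = {"hidden": [0.5, 1.0], "labels": [1.0, 2.0]}

default_out = toy_forward(**batch)                  # logits omitted
eval_out = toy_forward(**batch, skip_logits=False)  # logits returned
```

With the default, `default_out["logits"]` is None, so any evaluation code that compares logits between the patched and unpatched models would have nothing to compare; passing `skip_logits=False` makes the logits available.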

@momochen
Collaborator

momochen commented Oct 5, 2025

Thanks for adding this model and for fixing the multimodal model evaluation test. LGTM.

@shimizust shimizust merged commit 9584b94 into linkedin:main Oct 5, 2025
@MilkClouds MilkClouds mentioned this pull request Oct 26, 2025