Fix Aria tests #37444

Merged · 18 commits into huggingface:main · Apr 24, 2025

Conversation

jiqing-feng (Contributor)

Reproduce:
TRANSFORMERS_TEST_DEVICE=cuda RUN_SLOW=1 pytest -rA tests/models/aria/test_modeling_aria.py::AriaForConditionalGenerationIntegrationTest::test_batched_generation

Error log:

FAILED tests/models/aria/test_modeling_aria.py::AriaForConditionalGenerationIntegrationTest::test_batched_generation - RuntimeError: INDICES element is out of DATA bounds, id=100352 axis_dim=100352

I found 4 issues in this case:

  1. The pad_token_id equals vocab_size, which causes the out-of-bounds error. This happens because both config.json and added_tokens.json define a pad token.
  2. The input images and input prompts should have the same batch size.
  3. The input pixel values should be cast to model.dtype, as shown in the model hub usage example.
  4. This model uses torch MHA, which reads the weights directly and applies torch.matmul without dequantizing them, so we should skip quantizing the MHA weights.

After fixing these four issues, the test runs correctly (a rough sketch of the input-side fixes is shown below).
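For reference, here is a rough sketch of what fixes 2-4 could look like (matching prompt/image batch sizes, casting pixel values, and skipping quantization of the attention weights). The checkpoint id and chat-template usage mirror the test below; the image URLs, the chat-message structure, and the module name passed to llm_int8_skip_modules are illustrative assumptions, not the exact values used in the test:

    import requests
    import torch
    from PIL import Image
    from transformers import AriaForConditionalGeneration, AutoProcessor, BitsAndBytesConfig

    model_id = "rhymes-ai/Aria"

    # Fix 4 (sketch): keep the torch-MHA weights unquantized, since torch.matmul reads them
    # directly without dequantizing. The module name below is an assumption for illustration.
    quantization_config = BitsAndBytesConfig(load_in_4bit=True, llm_int8_skip_modules=["multihead_attn"])
    model = AriaForConditionalGeneration.from_pretrained(
        model_id, quantization_config=quantization_config, torch_dtype=torch.bfloat16, device_map="auto"
    )
    processor = AutoProcessor.from_pretrained(model_id)

    image1 = Image.open(requests.get("https://llava-vl.github.io/static/images/view.jpg", stream=True).raw)
    image2 = Image.open(requests.get("http://images.cocodataset.org/val2017/000000039769.jpg", stream=True).raw)

    messages = [
        {"role": "user", "content": [{"type": "image"}, {"type": "image"},
                                     {"type": "text", "text": "What's the difference of two images?"}]},
        {"role": "user", "content": [{"type": "image"},
                                     {"type": "text", "text": "Describe the image."}]},
    ]
    prompts = [processor.apply_chat_template([message], add_generation_prompt=True) for message in messages]

    # Fix 2: the nested image list has the same outer batch size (2) as the prompt list.
    images = [[image1, image2], [image2]]

    inputs = processor(text=prompts, images=images, padding=True, return_tensors="pt").to(model.device)
    # Fix 3: cast pixel values to the model dtype before generation.
    inputs["pixel_values"] = inputs["pixel_values"].to(model.dtype)

    output = model.generate(**inputs, max_new_tokens=20)
    print(processor.batch_decode(output, skip_special_tokens=True))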

@github-actions github-actions bot marked this pull request as draft April 11, 2025 06:49
github-actions bot (Contributor)

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the Ready for review button (at the bottom of the PR page). This will assign reviewers and trigger CI.

@jiqing-feng jiqing-feng marked this pull request as ready for review April 11, 2025 06:52
@github-actions github-actions bot requested a review from ydshieh April 11, 2025 06:52
@jiqing-feng (Contributor, Author)

Hi @ydshieh @SunMarc. Please review this PR. Thanks!

@Rocketknight1 (Member)

cc @zucchini-nlp for VLMs as well

@zucchini-nlp (Member)

run-slow: aria

github-actions bot (Contributor)

This comment contains run-slow, running the specified jobs:

models: ['models/aria']
quantizations: [] ...

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@ydshieh (Collaborator) left a comment

Hi @jiqing-feng, thank you a lot. I left a few comments.

processor = AutoProcessor.from_pretrained("rhymes-ai/Aria")
processor.tokenizer.pad_token_id = model.config.pad_token_id
Collaborator

The pad_token_id equals vocab_size, which causes the out-of-bounds error. This happens because both config.json and added_tokens.json define a pad token.

Do you know if this is also an issue on the Hub repository which needs a fix too?

Member

+1, if the pad token is OOV for the official weights, we'll see errors, for example in batched generation where one sequence emits EOS earlier than the other.
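To make the failure mode concrete, here is a tiny, hypothetical sketch (not tied to Aria's actual kernels, which raise the RuntimeError shown in the log) of why an id equal to vocab_size is out of bounds for the embedding table:

    import torch

    vocab_size = 100352  # from the error log above
    embedding = torch.nn.Embedding(num_embeddings=vocab_size, embedding_dim=8)

    # A pad id equal to vocab_size points one row past the end of the embedding table.
    input_ids = torch.tensor([[1, 2, vocab_size]])
    try:
        embedding(input_ids)
    except (IndexError, RuntimeError) as err:
        print(f"out-of-bounds lookup: {err}")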

Contributor Author

I can see there are 2 different pad tokens in the model hub and am not sure if that's reasonable, but I can confirm that transformers doesn't handle this case. We can either change the model hub or handle this kind of case in transformers.

Member

I see now, the tokenizer has a new pad token added and the embeddings were not resized for it. Yes, can you open a PR on the hub repo to remove <pad> from the tokenizer and assign the pad token to the <end_of_text> token? Thanks

Contributor Author

done here

Collaborator

The tokenizer_config.json there still has

    "100352": {
      "content": "<pad>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }

I think what @zucchini-nlp means is: load the tokenizer from the Hub, assign pad_token_id to 2, save it, and open a PR with the newly saved files.

But what does <end_of_text> mean here, @zucchini-nlp? Do you simply mean the eos_token_id, which is 2 in the config?
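A rough sketch of that load-assign-save flow, assuming the tokenizer's own EOS token matches the eos_token_id=2 from the config, that push_to_hub(..., create_pr=True) is used to open the Hub PR, and with an arbitrary local directory name:

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("rhymes-ai/Aria")

    # Reuse the EOS token (id 2 in the config) as the pad token instead of the extra <pad> entry.
    tokenizer.pad_token = tokenizer.eos_token

    tokenizer.save_pretrained("aria-tokenizer-fixed")        # inspect the re-saved files locally first
    tokenizer.push_to_hub("rhymes-ai/Aria", create_pr=True)  # open a PR on the Hub repo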

@zucchini-nlp (Member), Apr 16, 2025

Exactly, that is the token with id=2, in other words the id specified in the config file. We need to use the same pad token in the config and in the tokenizer for consistency; otherwise it might raise errors, as already seen in this PR.

@jiqing-feng the PR you opened is great but unfortunately doesn't solve the issue. Removing the added_tokens file is not going to remove <pad> from the vocab entirely. I'd recommend doing as suggested by @ydshieh: load, assign, and save back. Currently we still have <pad> in special_tokens_map.json and tokenizer.json.

Contributor Author

This change correctly assigns the pad token id to 2; please check it. Thanks

Comment on lines 461 to 479
prompts = [processor.apply_chat_template([message], add_generation_prompt=True) for message in messages]
images = [[image1, image2], [image2]]
inputs = processor(text=prompts, images=images, padding=True, return_tensors="pt").to(model.device)
inputs["pixel_values"] = inputs["pixel_values"].to(model.dtype)

EXPECTED_OUTPUT = (
    [
        "<|im_start|>user\n<fim_prefix><fim_suffix> <image>\n <image>\n USER: What's the difference of two images?\n ASSISTANT:<fim_prefix><fim_suffix> <image>\n USER: Describe the image.\n ASSISTANT:<|im_end|>\n <|im_start|>assistant\n The first image features a cute, light-colored puppy sitting on a paved surface with",
        "<|im_start|>user\n<fim_prefix><fim_suffix> <image>\n USER: Describe the image.\n ASSISTANT:<|im_end|>\n <|im_start|>assistant\n The image shows a young alpaca standing on a grassy hill. The alpaca has",
    ],  # cpu output
    [
        "<|im_start|>user\n<fim_prefix><fim_suffix> <image>\n <image>\n USER: What's the difference of two images?\n ASSISTANT:<fim_prefix><fim_suffix> <image>\n USER: Describe the image.\n ASSISTANT:<|im_end|>\n <|im_start|>assistant\n The first image features a cute, light-colored puppy sitting on a paved surface with",
        "<|im_start|>user\n<fim_prefix><fim_suffix> <image>\n USER: Describe the image.\n ASSISTANT:<|im_end|>\n <|im_start|>assistant\n The image shows a young alpaca standing on a patch of ground with some dry grass. The",
    ],  # cuda output
    [
        "<|im_start|>user\n<fim_prefix><fim_suffix> <image>\n <image>\n USER: What's the difference of two images?\n ASSISTANT:<fim_prefix><fim_suffix> <image>\n USER: Describe the image.\n ASSISTANT:<|im_end|>\n <|im_start|>assistant\n The first image shows a cute, light-colored puppy sitting on a paved surface with",
        "<|im_start|>user\n<fim_prefix><fim_suffix> <image>\n USER: Describe the image.\n ASSISTANT:<|im_end|>\n <|im_start|>assistant\n The image shows a young alpaca standing on a grassy hill. The alpaca has",
    ],  # xpu output
)
Collaborator

Could you try to apply changes similar to those in https://github.com/huggingface/transformers/pull/37126/files, specifically the changes in tests/models/llama/test_modeling_llama.py there 🙏
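For context, a rough, hypothetical sketch of the device-keyed expectation pattern that PR applies to the llama tests; the dict and names below are illustrative only, not the actual helper introduced there:

    from transformers.testing_utils import torch_device

    # Hypothetical: pick the expected strings for whichever accelerator the test runs on,
    # falling back to the CPU reference when the device is not listed.
    EXPECTED_OUTPUTS_PER_DEVICE = {
        "cpu": ["expected text for cpu ..."],
        "cuda": ["expected text for cuda ..."],
        "xpu": ["expected text for xpu ..."],
    }
    expected_output = EXPECTED_OUTPUTS_PER_DEVICE.get(torch_device, EXPECTED_OUTPUTS_PER_DEVICE["cpu"])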

Contributor Author

Currently we haven't found any failing tests on llama. To avoid confusion, maybe we can do it in our next PR. Let's focus on Aria in this PR.

@zucchini-nlp (Member)

Hm, slow tests are OOM-ing for aria, do you think we need to add @require_torch_large_gpu @ydshieh?

@ydshieh (Collaborator) commented Apr 14, 2025

Hm, slow tests are OOM-ing for aria, do you think we need to add @require_torch_large_gpu @ydshieh?

Adding @require_torch_large_gpu means it will not be run in the daily CI, only on some occasions that team members check with less focus.

However, all those integration tests already use 4-bit and there is no smaller model size for this architecture. Let's add it anyway.

Thank you for bringing this issue up 👍
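A minimal sketch of what adding the decorator could look like on the integration test class (the class body is elided; require_torch_large_gpu is assumed to be importable from transformers.testing_utils, as discussed above):

    import unittest

    from transformers.testing_utils import require_torch_large_gpu, slow


    @slow
    @require_torch_large_gpu  # skip on daily-CI runners without a large-memory GPU
    class AriaForConditionalGenerationIntegrationTest(unittest.TestCase):
        ...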

@jiqing-feng (Contributor, Author)

Hi @ydshieh @zucchini-nlp. I have addressed your comments, please review again. Thanks!

@zucchini-nlp (Member) left a comment

Thanks @jiqing-feng! Looks good to me, left one tiny comment so we get better error messages when tests fail.

Approving, but before merging can you update the hub config as suggested in the comment above, and remove the line processor.tokenizer.pad_token_id = model.config.pad_token_id from the tests (if the pad token is assigned correctly, we shouldn't need to explicitly set it to eos again)?

@jiqing-feng (Contributor, Author)

Thanks @jiqing-feng! Looks good to me, left one tiny comment so we get better error messages when tests fail.

Approving, but before merging can you update the hub config as suggested in the comment above, and remove the line processor.tokenizer.pad_token_id = model.config.pad_token_id from the tests (if the pad token is assigned correctly, we shouldn't need to explicitly set it to eos again)?

Yes, I will remove it once 20 is merged

@jiqing-feng (Contributor, Author)

Hi @ydshieh @zucchini-nlp, the model hub PR is merged and I have also removed the pad_token setting. The failing tests seem unrelated to my changes.

@zucchini-nlp (Member)

@ydshieh can you merge?

@zucchini-nlp (Member)

Tests were fixed on main; a rebase should help now.

@jiqing-feng (Contributor, Author)

Hi @zucchini-nlp. Yes, the tests all passed after I rebased onto the main branch. This PR is ready to be merged.

@zucchini-nlp zucchini-nlp merged commit b7f7aa7 into huggingface:main Apr 24, 2025
14 checks passed
rziga pushed a commit to rziga/transformers that referenced this pull request Apr 24, 2025
* update aria tests

Signed-off-by: jiqing-feng <[email protected]>

* add cuda tests

Signed-off-by: jiqing-feng <[email protected]>

* check outputs for cpu and cuda and xpu

Signed-off-by: jiqing-feng <[email protected]>

* check outputs for cpu and cuda and xpu

Signed-off-by: jiqing-feng <[email protected]>

* check outputs for cpu and cuda and xpu

Signed-off-by: jiqing-feng <[email protected]>

* check output for each device

Signed-off-by: jiqing-feng <[email protected]>

* fix style

Signed-off-by: jiqing-feng <[email protected]>

* fix style

Signed-off-by: jiqing-feng <[email protected]>

* fix xpu output

Signed-off-by: jiqing-feng <[email protected]>

* add comments and use assert list equal

Signed-off-by: jiqing-feng <[email protected]>

* rm pad token assign

Signed-off-by: jiqing-feng <[email protected]>

---------

Signed-off-by: jiqing-feng <[email protected]>
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request May 14, 2025