@akshay-babbar akshay-babbar commented Aug 30, 2025

Problem

Fixes #12116

Short prompts generate corrupted images because of an attention mask dtype conversion bug.

Root Cause

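Attention masks are converted from bool → float16/bfloat16. PyTorch's scaled_dot_product_attention treats a boolean mask as keep/drop per position, but adds a float mask to the attention scores, so a converted mask of 0.0/1.0 values no longer excludes padding tokens.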
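To illustrate the difference between the two mask conventions, here is a minimal framework-free sketch. It does not call PyTorch; `attention_weights` is a hypothetical helper that mimics the documented behaviour (a boolean mask excludes positions, a float mask is added to the scores), and the score values are made up for the example.

```python
import math

def attention_weights(scores, mask):
    """Softmax over one row of attention scores under a mask.

    Mimics the two mask conventions:
    - bool mask: True keeps a position, False excludes it (score becomes -inf)
    - float mask: values are added to the scores before the softmax
    """
    if all(isinstance(m, bool) for m in mask):
        masked = [s if m else -math.inf for s, m in zip(scores, mask)]
    else:
        masked = [s + m for s, m in zip(scores, mask)]
    peak = max(masked)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in masked]
    total = sum(exps)
    return [e / total for e in exps]

# One query attending to 2 real tokens followed by 2 padding tokens
# (illustrative values, not taken from the pipeline).
scores = [0.5, 0.2, 0.9, 0.4]
bool_mask = [True, True, False, False]
float_mask = [float(m) for m in bool_mask]  # what a bool -> float cast produces

kept = attention_weights(scores, bool_mask)    # padding is excluded
leaky = attention_weights(scores, float_mask)  # padding still receives weight
```

With the boolean mask, the two padding positions get exactly zero weight. With the cast mask, they keep nonzero weight because 0.0 and 1.0 only shift the scores slightly, and the effect is larger for short prompts, where most positions are padding.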

Solution

  • Preserve boolean dtype in _get_t5_prompt_embeds
  • Remove dtype conversions in _prepare_attention_mask
  • Fix both positive and negative attention masks
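The idea behind these changes can be sketched without any framework. This is only an illustration, not the code in this PR: `build_text_mask` and `extend_for_image_tokens` are hypothetical names, and the layout (text tokens followed by image tokens, with image tokens always attended) is an assumption made for the example.

```python
def build_text_mask(num_real_tokens, max_length):
    """Boolean padding mask: True for real tokens, False for padding."""
    return [i < num_real_tokens for i in range(max_length)]

def extend_for_image_tokens(text_mask, num_image_tokens):
    """Append always-attended image positions, keeping every entry a bool."""
    return list(text_mask) + [True] * num_image_tokens

# A short prompt: 2 real tokens padded to length 5, then 3 image tokens.
text_mask = build_text_mask(num_real_tokens=2, max_length=5)
full_mask = extend_for_image_tokens(text_mask, num_image_tokens=3)
```

The point of the sketch is that the extension step appends boolean values rather than floats, so the mask never changes type before it reaches attention.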

Testing

✅ Added @slow unit tests for dtype preservation
✅ Verified fix with prompts: "man", "cat"
✅ All tests pass locally

Please review when you have a chance. Thank you for your time and consideration!

- Convert attention masks to bool and prevent dtype corruption
- Fix both positive and negative mask handling in _get_t5_prompt_embeds
- Remove float conversion in _prepare_attention_mask method

Fixes huggingface#12116
@akshay-babbar akshay-babbar changed the title Fix #12116: preserve boolean dtype for attention masks in ChromaPipeline Fix #12116 : preserve boolean dtype for attention masks in ChromaPipeline Aug 30, 2025
@akshay-babbar akshay-babbar changed the title Fix #12116 : preserve boolean dtype for attention masks in ChromaPipeline Fix #12116 preserve boolean dtype for attention masks in ChromaPipeline Aug 30, 2025
@akshay-babbar akshay-babbar changed the title Fix #12116 preserve boolean dtype for attention masks in ChromaPipeline Fix #12116: preserve boolean dtype for attention masks in ChromaPipeline Aug 30, 2025
@akshay-babbar
Author

akshay-babbar commented Sep 1, 2025

Hello @DN6 @yiyixuxu, could you please review this PR and share feedback?
Thanks!

@akshay-babbar
Author

Hello @DN6, just checking in to see if you’ve had a chance to look at the above PR. If you’re not the right person or are busy, would you mind pointing me to someone who could review it?

Thanks!

@yiyixuxu yiyixuxu requested a review from DN6 September 10, 2025 19:51
@yiyixuxu
Collaborator

thanks @akshay-babbar
can you show outputs before/after?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@akshay-babbar
Author

akshay-babbar commented Sep 13, 2025

hello @yiyixuxu @DN6

Thanks for the response! I'm new to diffusers, so still learning best practices through the docs and codebase. Do let me know if there are any issues with my changes.

I used these 3 prompts: [man, king, doctor]

Negative prompt used: blurry, low quality, naked, NSFW, nude, deformed

Below are the results!

Please review and let me know your feedback and any next steps.

Thanks!

Before

Man: [image: Before_v2_change_1_man]

Doctor: [image: Before_v2_change_2_doctor]

King: [image: Before_v2_change_4_king]

After Code Changes

Man: [image: After_v2_change_1_man]

Doctor: [image: After_v2_change_2_doctor]

King: [image: After_v2_change_4_king]

Development

Successfully merging this pull request may close these issues.

Attention masking in Chroma pipeline
3 participants