
IP-Adapter for FluxImg2ImgPipeline #10717

Closed
wants to merge 3 commits into from

Conversation

guiyrt
Contributor

@guiyrt guiyrt commented Feb 4, 2025

What does this PR do?

Adds image prompting support for FluxImg2ImgPipeline via FluxIPAdapterMixin, as part of #10689.

Example output:
astro_grid

I made some minor changes to pipeline_flux.py regarding documentation and argument order, for consistency.
I also adapted the ip_adapter code a bit for clarity; hope you find it simpler this way :)
I'm not sure if you prefer it this way, but I changed FluxIPAdapterPipelineSlowTests so that the other pipelines can reuse it simply by adding input arguments and changing the expected output. Let me know if subclassing like this is ok, or if you prefer having the full class definition for each pipeline test.

Inference code
import torch

from diffusers import FluxImg2ImgPipeline
from diffusers.utils import load_image

model_id = "black-forest-labs/FLUX.1-dev"
image_encoder_id = "openai/clip-vit-large-patch14"
ip_adapter_id = "XLabs-AI/flux-ip-adapter"

pipe = FluxImg2ImgPipeline.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16
)

# Load IP Adapter
pipe.load_ip_adapter(
    ip_adapter_id,
    weight_name="ip_adapter.safetensors",
    image_encoder_pretrained_model_name_or_path=image_encoder_id
)
pipe.set_ip_adapter_scale(1.0)

# Offload submodules to CPU sequentially to reduce VRAM usage
pipe.enable_sequential_cpu_offload()

reference_img = load_image("astronaut.jpg")
ip_adapter_img = load_image("../reference_images/abstract.png")

image = pipe(
    image=reference_img,
    width=1024,
    height=1024,
    negative_prompt="lowres, low quality, worst quality",
    generator=torch.manual_seed(42),
    ip_adapter_image=ip_adapter_img,
    guidance_scale=15,
    num_inference_steps=40,
    strength=0.85,
    prompt="an astronaut in space"
).images[0]

image.save("result.jpg")

Before submitting

Who can review?

@hlky @yiyixuxu

@guiyrt
Contributor Author

guiyrt commented Feb 4, 2025

While testing I also found that if you load the IP-Adapter, you must provide an image prompt, otherwise the pipeline fails. I will open a PR for that; it's just a check for ip_hidden_states is None in FluxIPAdapterJointAttnProcessor2_0.
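For illustration, a minimal sketch of the kind of guard being described, with the surrounding attention logic reduced to a placeholder (the function name and the scaled addition are assumptions for the sketch, not the actual diffusers processor code):

import torch

# Hypothetical stand-in for the IP-Adapter branch inside the attention processor;
# the real FluxIPAdapterJointAttnProcessor2_0 runs cross-attention over the
# image-prompt embeddings instead of this simple scaled addition.
def joint_attention_with_optional_ip(hidden_states, ip_hidden_states=None, ip_scale=1.0):
    out = hidden_states  # placeholder for the regular joint attention output

    # The proposed check: only run the IP-Adapter branch when an image prompt
    # was actually provided, so a pipeline with a loaded IP-Adapter but no
    # ip_adapter_image behaves like the base pipeline instead of failing.
    if ip_hidden_states is not None:
        for ip_states in ip_hidden_states:
            out = out + ip_scale * ip_states

    return out

# Works with or without an image prompt:
x = torch.randn(1, 16, 64)
print(joint_attention_with_optional_ip(x).shape)
print(joint_attention_with_optional_ip(x, ip_hidden_states=[torch.randn(1, 16, 64)]).shape)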

Collaborator

@hlky hlky left a comment

Thanks @guiyrt. I had opened #10708 to cover FluxImg2Img and others.

While testing I also found that if you load the IP-Adapter, you must provide an image prompt, otherwise the pipeline fails.

Is this the same behavior as other IP-Adapters? I believe it may be expected that we call unload_ip_adapter if we no longer want to use ip_adapter_image.

Comment on lines +941 to +972
# handle image prompt
any_ip_adapter_input = ip_adapter_image is not None or ip_adapter_image_embeds is not None
any_negative_ip_adapter_input = (
    negative_ip_adapter_image is not None or negative_ip_adapter_image_embeds is not None
)

if any_ip_adapter_input and not any_negative_ip_adapter_input:
    negative_ip_adapter_image = np.zeros((width, height, 3), dtype=np.uint8)
elif any_negative_ip_adapter_input and not any_ip_adapter_input:
    ip_adapter_image = np.zeros((width, height, 3), dtype=np.uint8)

image_embeds = (
    self.prepare_ip_adapter_image_embeds(
        ip_adapter_image,
        ip_adapter_image_embeds,
        device,
        batch_size * num_images_per_prompt,
    )
    if any_ip_adapter_input
    else None
)

negative_image_embeds = (
    self.prepare_ip_adapter_image_embeds(
        negative_ip_adapter_image,
        negative_ip_adapter_image_embeds,
        device,
        batch_size * num_images_per_prompt,
    )
    if any_negative_ip_adapter_input
    else None
)
Collaborator

Why is this changed? We can just copy from the text-to-image pipeline, no?

Contributor Author

It does the same thing; I just created the variables any_ip_adapter_input and any_negative_ip_adapter_input for readability, as we have many is None checks chained with boolean logical ops.



class FluxIPAdapterImg2ImgPipelineSlowTests(FluxIPAdapterPipelineSlowTests):
"""Same test as in FluxIPAdapterPipelineSlowTests, only with inital `image` and `strength` parameters."""
Collaborator

I don't think we need slow tests for IP-Adapter in each pipeline variant; the general functionality is tested with the fast test FluxIPAdapterTesterMixin.

@guiyrt
Contributor Author

guiyrt commented Feb 4, 2025

Thanks @guiyrt. I had opened #10708 to cover FluxImg2Img and others.

Oh, I started this yesterday and didn't notice 😢 Bad timing; we can close this then.

While testing I also found that if you load the IP-Adapter, you must provide an image prompt, otherwise the pipeline fails.

Is this the same behavior as other IP-Adapters? I believe it may be expected that we call unload_ip_adapter if we no longer want to use ip_adapter_image.

In SD3 you can load the IP-Adapter and not pass an image prompt, and it behaves like the unloaded SD3 pipeline. This can be useful when you don't control the pipeline args, such as in a Gradio demo or general model serving.
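For context, a hedged sketch of what that usage pattern could look like with the pipeline from the inference example above, assuming the ip_hidden_states is None guard mentioned earlier lands (without it, the text-only call below would currently require unload_ip_adapter first):

# Continuing from `pipe` and `reference_img` in the inference example above.

# Alternative discussed above: explicitly drop the adapter when image prompting
# is no longer wanted.
# pipe.unload_ip_adapter()

# Behavior described for SD3 (and proposed here for Flux): keep the IP-Adapter
# loaded and simply omit ip_adapter_image, falling back to plain img2img. This is
# convenient when the caller does not control the pipeline args, e.g. a Gradio
# demo or a generic model-serving endpoint.
image = pipe(
    prompt="an astronaut in space",
    image=reference_img,
    strength=0.85,
    num_inference_steps=40,
    generator=torch.manual_seed(42),
).images[0]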

@guiyrt
Contributor Author

guiyrt commented Feb 5, 2025

Thanks for still taking the time to review, @hlky. Is there anything else with priority open for the community? I can also check the roadmap, but if you have something in mind let me know :)

@guiyrt guiyrt closed this Feb 5, 2025
@hlky
Collaborator

hlky commented Feb 5, 2025

Would you like to try MultiControlNet-like support in ControlNet Union as a companion to #10723? See #10656 for context.

We need to restore all the MultiControlNet-related code, using StableDiffusionXLControlNetPipeline as a reference.


I'll handle integrating MultiControlNet and the experimental per-control-type scale support either in that PR or yours, depending on the merge order. For example, this will be either len(control_mode) or len(controlnet.nets).
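To make that distinction concrete, a rough sketch with a hypothetical helper (not code from either PR): the number of per-control-type scales comes from len(controlnet.nets) for a multi-model wrapper and from len(control_mode) for a single union model.

from typing import List, Union


def expand_conditioning_scale(
    scale: Union[float, List[float]], controlnet, control_mode: List[int]
) -> List[float]:
    # Hypothetical helper: one scale per wrapped sub-model (MultiControlNet case)
    # or one per control type (single ControlNetUnion case).
    num_scales = len(controlnet.nets) if hasattr(controlnet, "nets") else len(control_mode)
    if isinstance(scale, (int, float)):
        return [float(scale)] * num_scales
    if len(scale) != num_scales:
        raise ValueError(f"Expected {num_scales} scale values, got {len(scale)}")
    return [float(s) for s in scale]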

@guiyrt
Contributor Author

guiyrt commented Feb 6, 2025

Would you like to try MultiControlNet-like support in ControlNet Union as a companion to #10723? See #10656 for context.

I'll get on it 🚀

@guiyrt guiyrt deleted the flux_ipa_img2img branch February 6, 2025 15:13
@guiyrt
Contributor Author

guiyrt commented Feb 7, 2025

@hlky If I got it right, the idea is to make MultiControlNetModel accept ControlNetUnionModel, and to change StableDiffusionXLControlNetUnionPipeline so that its controlnet property is a MultiControlNetModel and the interface stays consistent. Is that it? Something like this:

controlnet: Union[
    ControlNetUnionModel,
    List[Union[ControlNetModel, ControlNetUnionModel]],
    Tuple[Union[ControlNetModel, ControlNetUnionModel]],
    MultiControlNetModel,
]

I noticed the PR you have open regarding the experiments with per-control-type scale, but from your previous message I couldn't tell whether you have already started integrating ControlNetUnionModel into MultiControlNetModel and want me to handle the changes to StableDiffusionXLControlNetUnionPipeline, or whether my PR should include both. Just wanted to clarify to avoid duplicated work :)

I didn't know about ControlNet Union, very interesting work, thanks for pointing me towards it! :)

@hlky
Collaborator

hlky commented Feb 7, 2025

Yes. MultiControlNetModel and its variants like FluxMultiControlNetModel are designed to support multiple control types: when we pass controlnet_conditioning_scale for multiple control types, it is handled by MultiControlNetModel and an individual scale value is passed to each ControlNetModel. When integrating ControlNetUnion I updated the original pipelines against the existing XL ControlNet, but since ControlNetUnion natively supports multiple control types I removed the MultiControlNetModel-related code. However, multiple scale values are not supported there, because we cannot apply them in the usual place; my experimental PR applies the multiple scale values at a different location. We'd still like to support the MultiControlNet case for consistency.

In this case we need a MultiControlNetUnionModel, as ControlNetUnion, like FluxControlNetModel, has a different interface than ControlNetModel.

StableDiffusionXLControlNetUnionPipeline (and the other ControlNetUnion pipelines) will have a controlnet property like:

controlnet: Union[
    ControlNetUnionModel, List[ControlNetUnionModel], Tuple[ControlNetUnionModel], MultiControlNetUnionModel
]

Essentially, compare against StableDiffusionXLControlNetPipeline and add any MultiControlNetModel-related code, but use MultiControlNetUnionModel and ControlNetUnionModel in place of MultiControlNetModel and ControlNetModel.
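A rough sketch of the wrapper shape being described, modeled on MultiControlNetModel; the forward signature and return values below are simplified assumptions for illustration, not the actual ControlNetUnionModel API:

from typing import List

import torch.nn as nn


class MultiControlNetUnionModelSketch(nn.Module):
    """Illustrative wrapper holding several ControlNetUnionModel instances."""

    def __init__(self, controlnets: List[nn.Module]):
        super().__init__()
        self.nets = nn.ModuleList(controlnets)

    def forward(self, sample, controlnet_cond, control_mode, conditioning_scale, **kwargs):
        # Each wrapped union model gets its own images, control modes and scale,
        # and the residuals are summed, mirroring MultiControlNetModel.
        down_res, mid_res = None, None
        for net, cond, mode, scale in zip(self.nets, controlnet_cond, control_mode, conditioning_scale):
            down, mid = net(sample, controlnet_cond=cond, control_mode=mode, conditioning_scale=scale, **kwargs)
            if down_res is None:
                down_res, mid_res = down, mid
            else:
                down_res = [d_prev + d for d_prev, d in zip(down_res, down)]
                mid_res = mid_res + mid
        return down_res, mid_res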

@guiyrt
Contributor Author

guiyrt commented Feb 7, 2025

In this case we need a MultiControlNetUnionModel, as ControlNetUnion, like FluxControlNetModel, has a different interface than ControlNetModel.

Yeah, this was my main concern about trying to just glue ControlNetUnion into MultiControlNetModel.

Got it, much clearer now, thanks!
