
ControlNet union pipeline fails on multi-model #10656

Open
vladmandic opened this issue Jan 26, 2025 · 15 comments
Labels
bug Something isn't working

Comments

@vladmandic
Contributor

Describe the bug

All controlnet pipelines typically define the `controlnet` argument as below (example from StableDiffusionXLControlNetPipeline):

controlnet: Union[ControlNetModel, List[ControlNetModel], Tuple[ControlNetModel], MultiControlNetModel],

However, the StableDiffusionXLControlNetUnionPipeline defines it simply as:

controlnet: ControlNetUnionModel

This defeats one of the main advantages of a union controlnet: being able to perform multiple guidances with the same model.
For reference, ControlNetUnion was added via PR #10131.
Any changes to the txt2img pipeline should also be mirrored in the img2img and inpaint pipelines.
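For comparison, the other controlnet pipelines accept a list of models and wrap it into a multi-model container at init time. A minimal, torch-free sketch of that normalization pattern (class and function names here are illustrative, not the exact diffusers source):

```python
# Sketch of the normalization other controlnet pipelines perform at __init__:
# a list/tuple of models is wrapped into a single multi-model container so the
# rest of the pipeline only ever sees one controlnet object.
class MultiControlNetModel:
    """Stand-in container holding several controlnet models."""
    def __init__(self, nets):
        self.nets = list(nets)

def normalize_controlnet(controlnet):
    """Accept a single model or a list/tuple; always return one object."""
    if isinstance(controlnet, (list, tuple)):
        return MultiControlNetModel(controlnet)
    return controlnet

single = object()
assert normalize_controlnet(single) is single  # single model passes through
wrapped = normalize_controlnet([object(), object()])
print(len(wrapped.nets))  # 2
```

The union pipeline skips this step entirely, which is why it rejects anything but a bare ControlNetUnionModel.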

Reproduction

control1 = ControlNetUnionModel.from_single_file(...)
control2 = ControlNetUnionModel.from_single_file(...)
pipe = StableDiffusionXLControlNetUnionPipeline.from_single_file(..., control=[control1, control2])

Logs

│    256 │   │   if not isinstance(controlnet, ControlNetUnionModel):                                                                                                                                                                                                                                                                                                                                                         │
│ ❱  257 │   │   │   raise ValueError("Expected `controlnet` to be of type `ControlNetUnionModel`.")                                                                                                                                                                                                                                                                                                                          │
│    258                                                                                                                                                                                                                                                                                                                                                                                                                      │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: Expected `controlnet` to be of type `ControlNetUnionModel`.

System Info

diffusers==0.33.0.dev0

Who can help?

@hlky @yiyixuxu @sayakpaul @DN6

@vladmandic vladmandic added the bug Something isn't working label Jan 26, 2025
@hlky
Collaborator

hlky commented Jan 26, 2025

The advantage of controlnet union is that we don't need to use multiple controlnet models, no? We just use multiple control conditionings with the same controlnet union model. What is the use case here?

@vladmandic
Contributor Author

Let's say I want to run openpose and canny. How?
I'm guessing you're saying to specify control_mode as a list instead of an int, and everything just works if I send two control_images as an input list?

The issue is that I need to match the number of all params: control_image, controlnet_conditioning_scale, start, end, etc.
And having StableDiffusionXLControlNetUnionPipeline behave totally differently from every other controlnet pipeline in every other base model makes for a massive if/then piece of code, when all controlnet pipelines should supposedly behave in a uniform way.

@hlky
Collaborator

hlky commented Jan 26, 2025

cc @yiyixuxu

@vladmandic
Contributor Author

OK, using a single ControlNetUnionModel in StableDiffusionXLControlNetUnionPipeline
and passing two inputs as a list for each arg FAILS.

'control_mode': [3, 0], # canny and openpose
'controlnet_conditioning_scale': [0.5, 0.8], # different strength for each conditioning model
'control_guidance_start': [0.1, 0.2], # explicit start
'control_guidance_end': [0.8, 0.9], # explicit stop
'control_image': [<PIL.Image.Image image mode=RGB size=768x768 at 0x773051C06DE0>, <PIL.Image.Image image mode=RGB size=768x768 at 0x773051F37710>], # preprocessed inputs with canny and openpose processors

TypeError: For single controlnet: controlnet_conditioning_scale must be type float.

@hlky so the intended behavior, as you noted, seems to be broken?

@hlky
Collaborator

hlky commented Jan 27, 2025

@vladmandic Apologies for the oversight, controlnet_conditioning_scale with multiple inputs was not tested. PR to fix that issue: #10666

@yiyixuxu
Collaborator

fixed in #10666 :)

@vladmandic
Contributor Author

Can we hold off on closing the issue until it's confirmed, either internally or externally?
This needs to work end-to-end for all args; it now fails with:

float(i / len(timesteps) < control_guidance_start or (i + 1) / len(timesteps) > control_guidance_end)
TypeError: '<' not supported between instances of 'float' and 'list'

@yiyixuxu
Collaborator

oh sorry, reopen it

@yiyixuxu yiyixuxu reopened this Jan 27, 2025
@hlky
Collaborator

hlky commented Jan 27, 2025

We can only apply a single scale here

if guess_mode and not self.config.global_pool_conditions:
    scales = torch.logspace(-1, 0, len(down_block_res_samples) + 1, device=sample.device)  # 0.1 to 1.0
    scales = scales * conditioning_scale
    down_block_res_samples = [sample * scale for sample, scale in zip(down_block_res_samples, scales)]
    mid_block_res_sample = mid_block_res_sample * scales[-1]  # last one
else:
    down_block_res_samples = [sample * conditioning_scale for sample in down_block_res_samples]
    mid_block_res_sample = mid_block_res_sample * conditioning_scale

In MultiControlNet we wrap the forward to handle a list of scales:

def forward(
    self,
    sample: torch.Tensor,
    timestep: Union[torch.Tensor, float, int],
    encoder_hidden_states: torch.Tensor,
    controlnet_cond: List[torch.tensor],
    conditioning_scale: List[float],
    class_labels: Optional[torch.Tensor] = None,
    timestep_cond: Optional[torch.Tensor] = None,
    attention_mask: Optional[torch.Tensor] = None,
    added_cond_kwargs: Optional[Dict[str, torch.Tensor]] = None,
    cross_attention_kwargs: Optional[Dict[str, Any]] = None,
    guess_mode: bool = False,
    return_dict: bool = True,
) -> Union[ControlNetOutput, Tuple]:
    for i, (image, scale, controlnet) in enumerate(zip(controlnet_cond, conditioning_scale, self.nets)):
        down_samples, mid_sample = controlnet(
            sample=sample,
            timestep=timestep,
            encoder_hidden_states=encoder_hidden_states,
            controlnet_cond=image,
            conditioning_scale=scale,
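The wrapping pattern above (one net per condition, each with its own scale, residuals summed) can be illustrated with a runnable, torch-free sketch. The class names TinyControlNet and MultiControlNet are stand-ins, not diffusers classes, and the arithmetic is only a toy model of the residual math:

```python
# Toy model of the MultiControlNet dispatch pattern: each wrapped net receives
# its own conditioning input and scale, and the per-net residuals are summed.
from typing import List, Tuple

class TinyControlNet:
    """Stand-in for a single controlnet: returns scaled toy residuals."""
    def forward(self, sample: float, cond: float, scale: float) -> Tuple[List[float], float]:
        base = sample * cond * scale
        down = [base, base * 0.5]   # two fake down-block residuals
        mid = base * 0.25           # one fake mid-block residual
        return down, mid

class MultiControlNet:
    """Wraps several nets; takes per-net cond inputs and scales, sums outputs."""
    def __init__(self, nets: List[TinyControlNet]):
        self.nets = nets

    def forward(self, sample: float, conds: List[float], scales: List[float]):
        total_down, total_mid = None, 0.0
        for net, cond, scale in zip(self.nets, conds, scales):
            down, mid = net.forward(sample, cond, scale)
            if total_down is None:
                total_down, total_mid = down, mid
            else:
                total_down = [a + b for a, b in zip(total_down, down)]
                total_mid += mid
        return total_down, total_mid

multi = MultiControlNet([TinyControlNet(), TinyControlNet()])
down, mid = multi.forward(1.0, conds=[1.0, 2.0], scales=[0.5, 0.8])
print(down, mid)  # each net contributed residuals with its own scale applied
```

This is exactly the per-condition independence that a single ControlNetUnionModel cannot offer internally, since its fuser sums all conditions before any scale is applied.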

The point of ControlNet Union is to reduce the inference cost of multiple controlnets, but that seems to create a limitation: we can't apply a scale per control type. As an experiment, we could try applying the scale at another point in the code:

for cond, control_idx in zip(controlnet_cond, control_type_idx):
    condition = self.controlnet_cond_embedding(cond)
    feat_seq = torch.mean(condition, dim=(2, 3))
    feat_seq = feat_seq + self.task_embedding[control_idx]
    inputs.append(feat_seq.unsqueeze(1))
    condition_list.append(condition)

condition = sample
feat_seq = torch.mean(condition, dim=(2, 3))
inputs.append(feat_seq.unsqueeze(1))
condition_list.append(condition)

x = torch.cat(inputs, dim=1)
for layer in self.transformer_layes:
    x = layer(x)

controlnet_cond_fuser = sample * 0.0
for idx, condition in enumerate(condition_list[:-1]):
    alpha = self.spatial_ch_projs(x[:, idx])
    alpha = alpha.unsqueeze(-1).unsqueeze(-1)
    controlnet_cond_fuser += condition + alpha

sample = sample + controlnet_cond_fuser

@vladmandic
Contributor Author

The example you've posted is about conditioning_scale.
What about control_guidance_start and control_guidance_end?
If I can control those two, I can get close to having a per-condition scale.

Anyhow, my primary goal here is a uniform interface.
Secondary is adding functionality over time.
I'm OK with scale being a single value due to model limitations, but I'm not OK with the pipeline throwing random runtime errors, forcing me to guess what the model expects and then add special if/then code just to deal with special cases.
At the very least, allow processing using the first value in the list and log a warning; that would be normal behavior.

@hlky
Collaborator

hlky commented Jan 28, 2025

control_guidance_start, control_guidance_end, and controlnet_conditioning_scale are linked; they all control the final cond_scale value.

image = pipe(
    prompt,
    control_image=[controlnet_img, controlnet_img],
    control_mode=[3, 3],
    controlnet_conditioning_scale=[0.5, 0.5],
    control_guidance_start=[0.1, 0.2],
    control_guidance_end=[0.8, 0.9],
    height=1024,
    width=1024,
).images[0]

# align format for control guidance
if not isinstance(control_guidance_start, list) and isinstance(control_guidance_end, list):
    control_guidance_start = len(control_guidance_end) * [control_guidance_start]
elif not isinstance(control_guidance_end, list) and isinstance(control_guidance_start, list):
    control_guidance_end = len(control_guidance_start) * [control_guidance_end]

# control_guidance_start  control_guidance_end
#   [0.1, 0.2]              [0.8, 0.9]

controlnet_keep = []
for i in range(len(timesteps)):
    controlnet_keep.append(
        1.0
        - float(i / len(timesteps) < control_guidance_start or (i + 1) / len(timesteps) > control_guidance_end)
    )

It should instead be (with `controlnet_keep.append(keeps)`, not `keeps[0]`):

controlnet_keep = []
for i in range(len(timesteps)):
    keeps = [
        1.0 - float(i / len(timesteps) < s or (i + 1) / len(timesteps) > e)
        for s, e in zip(control_guidance_start, control_guidance_end)
    ]
    controlnet_keep.append(keeps[0] if isinstance(controlnet, ControlNetModel) else keeps)

[[0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
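The keep mask above can be reproduced standalone; assuming 50 timesteps (to match the 50-entry list printed here), the per-pair computation is:

```python
# Standalone reproduction of the per-pair controlnet_keep computation.
# num_timesteps = 50 is an assumption chosen to match the 50-entry output.
num_timesteps = 50
control_guidance_start = [0.1, 0.2]
control_guidance_end = [0.8, 0.9]

controlnet_keep = []
for i in range(num_timesteps):
    keeps = [
        1.0 - float(i / num_timesteps < s or (i + 1) / num_timesteps > e)
        for s, e in zip(control_guidance_start, control_guidance_end)
    ]
    controlnet_keep.append(keeps)

print(controlnet_keep[0])   # [0.0, 0.0]: before both start thresholds
print(controlnet_keep[20])  # [1.0, 1.0]: both conditions active mid-run
print(controlnet_keep[42])  # [0.0, 1.0]: first ended at 0.8, second still active
```

Each condition therefore switches on and off independently, which is what the per-element cond_scale multiplication below relies on.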

if isinstance(controlnet_keep[i], list):
    cond_scale = [c * s for c, s in zip(controlnet_conditioning_scale, controlnet_keep[i])]

[0.0, 0.0]

As above, we can experiment with applying the scale in a different place to get the expected effect. For now the limitation is a single value for control_guidance_start, control_guidance_end, and controlnet_conditioning_scale. If you'd like, we can handle that by taking the first value from the list and logging a warning, as you suggested.
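The take-first-and-warn fallback being discussed could look roughly like this (an illustrative sketch only; the helper name `coerce_scalar` is hypothetical, not a diffusers API):

```python
# Hypothetical fallback: when a pipeline only supports a scalar for an
# argument, accept a list anyway, use its first element, and warn instead of
# raising a TypeError.
import warnings

def coerce_scalar(value, name: str):
    """Return `value`, or its first element with a warning if it is a sequence."""
    if isinstance(value, (list, tuple)):
        if len(value) > 1:
            warnings.warn(
                f"`{name}` received {len(value)} values but this pipeline supports "
                f"only one; using the first ({value[0]})."
            )
        return value[0]
    return value

scale = coerce_scalar([0.5, 0.8], "controlnet_conditioning_scale")
print(scale)  # 0.5
```

A caller passing uniform list-style arguments across all controlnet pipelines would then degrade gracefully instead of crashing.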

@vladmandic
Contributor Author

vladmandic commented Jan 28, 2025

If you'd like we can handle that by taking the first value from the list and logging a warning as you suggested.

Yes, anything to reduce runtime errors; the library should be able to handle its own limitations (and there are always some) without crashing.

But if we cannot support multiple scale/start/end values nicely inside a single ControlNetUnion, perhaps we really should also allow multiple models to run inside the pipeline, just like any other controlnet?

@asomoza
Member

asomoza commented Jan 28, 2025

Hi, AFAIK controlnet union doesn't allow (here or in any other UI/library) controlling start, end, or scale per condition image; they always apply to the whole controlnet. The only advantage is that you can use multiple condition images with one controlnet, nothing more.

We should make it usable alongside other controlnets if people want; or, for example, to use a different start, end, or scale, people should pass another instance, which would solve that problem.

Of course, being able to control the start, end, and scale of each condition image would be ideal, but that is a feature request, not an issue.

@vladmandic
Contributor Author

@asomoza that's pretty much what I wrote?

  • allow the controlnet union pipeline to use a multi-controlnet as input, not just a single controlnet
  • an additional ask is that the pipeline has uniform input args with all other pipelines. For example, if it's normal that a pipeline takes list[scale] as input, it should log a warning instead of crashing with a runtime error.

@asomoza
Member

asomoza commented Jan 28, 2025

Yeah, I was separating the part about controlling each control type from the issue, for future reference and to make it clear.

Your list is what this issue should be about and what should be fixed. Just take into consideration that this is a very special controlnet that differs a lot from the others; that's why we're having these issues, and your feedback helps us a lot with it. I almost never use it with other controlnets, so it also passed under my radar.
