Adding AutoencoderKL model returns option request #10614

Open
2 tasks done
lin-tianyu opened this issue Jan 21, 2025 · 0 comments
Model/Pipeline/Scheduler description

Environment

Using diffusers==0.32.2 and PyTorch 2.5.1

Context

I am developing an AutoencoderKL fine-tuning script and have finished the single-GPU training part, but the current interface of the AutoencoderKL model makes distributed training almost impossible.

Specifically, the fine-tuning objective combines a reconstruction loss with a KL loss, and computing the KL loss requires the following calls:

posterior = model.encode(pixel_values).latent_dist
latents = posterior.sample()
reconstructed = model.decode(latents).sample

kl_loss = posterior.kl().mean()  # KL divergence of the posterior against the unit Gaussian

Two impractical workarounds for distributed training

However, in distributed settings the AutoencoderKL model is wrapped (e.g. by DistributedDataParallel), which makes the following two implementations impractical. Both of them work, but they are extremely SLOW.

Accessing encode and decode through the .module attribute

posterior = model.module.encode(pixel_values).latent_dist  # call `.module` of the wrapped model, but this takes time
latents = posterior.sample()
reconstructed = model.module.decode(latents).sample        # call `.module` of the wrapped model, but this takes time

Unwrapping the model before the training loop

unwrapped_model = accelerator.unwrap_model(model)
for i, batch in enumerate(train_dataloader):
    # do something
    posterior = unwrapped_model.encode(pixel_values).latent_dist   # use the unwrapped model to access
    latents = posterior.sample()    
    reconstructed = unwrapped_model.decode(latents).sample    # use the unwrapped model to access
    # do the rest

Request

As a result, I would like to request an option for the AutoencoderKL forward pass to return intermediate values, such as the posterior, that can be used for fine-tuning. Any other way to make this possible would also work.
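One possible shape for such an option (a sketch, not diffusers' actual API) is a thin nn.Module wrapper whose forward() runs encode + decode and also returns the posterior; DDP/accelerate then wrap this module, so the ordinary forward() call triggers their gradient hooks and no .module indirection is needed. The ToyVAE and ToyDiagonalGaussian classes below are stand-ins, purely for illustration; in the real script the wrapped model would be diffusers' AutoencoderKL.

```python
import torch
import torch.nn as nn
from types import SimpleNamespace


class VAEWithPosterior(nn.Module):
    """Hypothetical wrapper around any model exposing AutoencoderKL-style
    encode()/decode() return objects (.latent_dist and .sample)."""

    def __init__(self, vae):
        super().__init__()
        self.vae = vae

    def forward(self, pixel_values):
        posterior = self.vae.encode(pixel_values).latent_dist
        latents = posterior.sample()
        reconstructed = self.vae.decode(latents).sample
        return reconstructed, posterior


# Minimal stand-in VAE for demonstration only.
class ToyDiagonalGaussian:
    def __init__(self, mean, logvar):
        self.mean, self.logvar = mean, logvar

    def sample(self):
        return self.mean + torch.exp(0.5 * self.logvar) * torch.randn_like(self.mean)

    def kl(self):
        # KL(N(mean, exp(logvar)) || N(0, 1)), summed over latent dims
        return 0.5 * torch.sum(self.mean**2 + self.logvar.exp() - 1.0 - self.logvar, dim=1)


class ToyVAE(nn.Module):
    def __init__(self, dim=4, latent_dim=2):
        super().__init__()
        self.enc = nn.Linear(dim, 2 * latent_dim)  # predicts mean and logvar
        self.dec = nn.Linear(latent_dim, dim)

    def encode(self, x):
        mean, logvar = self.enc(x).chunk(2, dim=1)
        return SimpleNamespace(latent_dist=ToyDiagonalGaussian(mean, logvar))

    def decode(self, z):
        return SimpleNamespace(sample=self.dec(z))
```

In the training script one would then do something like `model = accelerator.prepare(VAEWithPosterior(vae))` and call `reconstructed, posterior = model(pixel_values)` directly, computing `kl_loss = posterior.kl().mean()` without touching `.module`.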

Thanks!

Open source status

  • The model implementation is available.
  • The model weights are available (Only relevant if addition is not a scheduler).

Provide useful links for the implementation

The fine-tuning code is NOT available yet but will be available in the near future.

The implementation of the core problem is provided above.
