Context

I am developing an AutoencoderKL fine-tuning script and have finished the single-GPU training part, but the current interface of the AutoencoderKL model makes it almost impossible to use in distributed training.
Specifically, the fine-tuning process combines a reconstruction loss and a KL loss (kl_loss), and computing kl_loss requires the following calls:
```python
posterior = model.encode(pixel_values).latent_dist
latents = posterior.sample()
reconstructed = model.decode(latents).sample
kl_loss = posterior.kl().mean()  # get kl_loss
```
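For reference, a minimal single-GPU version of the full objective might look like the sketch below; the MSE reconstruction term and the kl_weight value are assumptions for illustration, not part of the actual script.

```python
import torch.nn.functional as F

kl_weight = 1e-6  # assumed weighting between the two terms

posterior = model.encode(pixel_values).latent_dist
latents = posterior.sample()
reconstructed = model.decode(latents).sample

rec_loss = F.mse_loss(reconstructed, pixel_values)  # reconstruction loss (assumed MSE here)
kl_loss = posterior.kl().mean()                     # KL term from the posterior
loss = rec_loss + kl_weight * kl_loss
```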
Two failure cases when implementing distributed training
However, in a distributed training setup the AutoencoderKL model is wrapped, which makes the following two implementations impractical. Both of them work, but they are extremely SLOW.
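For context, the wrapping happens when the model is prepared for multi-GPU training; a minimal sketch (variable names are assumptions) is:

```python
from accelerate import Accelerator

accelerator = Accelerator()
# On multiple GPUs, `prepare` returns the model wrapped in DistributedDataParallel,
# which exposes forward() but not the underlying encode()/decode() methods.
model, optimizer, train_dataloader = accelerator.prepare(model, optimizer, train_dataloader)
```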
Accessing encode and decode through the .module attribute
```python
posterior = model.module.encode(pixel_values).latent_dist  # call `.module` of the wrapped model, but this takes time
latents = posterior.sample()
reconstructed = model.module.decode(latents).sample  # call `.module` of the wrapped model, but this takes time
```
Unwrapping the model before the training loop
```python
unwrapped_model = accelerator.unwrap_model(model)

for i, batch in enumerate(train_dataloader):
    # do something
    posterior = unwrapped_model.encode(pixel_values).latent_dist  # use the unwrapped model to access
    latents = posterior.sample()
    reconstructed = unwrapped_model.decode(latents).sample  # use the unwrapped model to access
    # do the rest
```
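(Aside: a third pattern that might sidestep the slowdown, sketched below purely as an untested assumption, is to run the whole encode/decode pass inside a small nn.Module so the distributed wrapper's forward path is still used; the VAEFineTuneWrapper name is hypothetical.)

```python
import torch.nn as nn

class VAEFineTuneWrapper(nn.Module):
    """Hypothetical helper: runs the full encode/decode pass inside forward()
    so the distributed wrapper sees a single forward call."""

    def __init__(self, vae):
        super().__init__()
        self.vae = vae

    def forward(self, pixel_values):
        posterior = self.vae.encode(pixel_values).latent_dist
        latents = posterior.sample()
        reconstructed = self.vae.decode(latents).sample
        kl_loss = posterior.kl().mean()
        return reconstructed, kl_loss

# wrapper = accelerator.prepare(VAEFineTuneWrapper(model))  # pass the raw AutoencoderKL here
# reconstructed, kl_loss = wrapper(pixel_values)
```

Even with such a wrapper, a built-in option to expose the posterior would be the cleaner solution, hence the request below.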
Request
As a result, I would like to ask for an option to return the intermediate values of the AutoencoderKL forward pass, such as the posterior, so that they can be used for fine-tuning the model. Any other way to make this possible would also work.
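To make the request concrete, something along the lines of the sketch below would be enough. sample_posterior is an existing argument of AutoencoderKL.forward; the return_posterior flag and the output.posterior field are hypothetical names, not existing diffusers API.

```python
# Hypothetical usage, assuming AutoencoderKL.forward grew a `return_posterior` option:
output = model(pixel_values, sample_posterior=True, return_posterior=True)

reconstructed = output.sample     # existing field of the decoder output
posterior = output.posterior      # hypothetical new field
kl_loss = posterior.kl().mean()
```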
Thanks!
Open source status
The model implementation is available.
The model weights are available (only relevant if the addition is not a scheduler).
Provide useful links for the implementation
The fine-tuning code is NOT available yet but will be available in the near future.
The implementation of the core problem is provided above.
Environment

Using diffusers==0.32.2 and PyTorch 2.5.1.