Burning Artifacts in LTX V2V Pipeline with T2V-Generated Videos at Mid-Range Strength Values #103
Comments
When generating the vid-to-vid output, are you using the same seed as the one used for the original video?
Hi. Thanks for your reply. Indeed, I was using the same seed for both T2V and V2V. Changing the seed resolved the issue. Do you have a possible explanation for why using the same seed results in this kind of artifact? Is it something specific to LTX or to the general diffusion process? I did not notice this issue with other video or image models. I also checked out your latest commit and tested T2V, and it seems that with a spatio-temporal guidance (STG) value of 1.0 the same artifacts are present in T2V itself. Using a very small STG value (e.g. 0.1) results in no such artifacts.
Hello, can you provide the code for the LTX V2V pipeline? It's very difficult to use it from ComfyUI. Thanks
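To make the seed question concrete, here is a minimal sketch of what "same seed" means for the noise itself. It uses only Python's standard library instead of the real pipeline, and `sample_noise` and `correlation` are hypothetical helpers, not LTX functions. It only shows that a generator seeded identically reproduces the same noise values, while a different seed gives values that are essentially uncorrelated. Whether the T2V and V2V pipelines actually draw the same tensor from the same seed, and whether that correlation is what causes the burning artifacts, is an assumption that this thread does not confirm.

```python
import random


def sample_noise(seed, n=10_000):
    """Draw n standard-normal values from a generator seeded with `seed`.

    Hypothetical stand-in for the noise a diffusion pipeline would sample.
    """
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


# Noise "used" for the original T2V sample (seed value is arbitrary).
t2v_noise = sample_noise(seed=42)

# Noise "added" in V2V: once with the same seed, once with a different one.
v2v_noise_same_seed = sample_noise(seed=42)
v2v_noise_new_seed = sample_noise(seed=43)

same_seed_corr = correlation(t2v_noise, v2v_noise_same_seed)
new_seed_corr = correlation(t2v_noise, v2v_noise_new_seed)

print(f"same seed:      {same_seed_corr:.4f}")
print(f"different seed: {new_seed_corr:.4f}")
```

If the pipelines do behave this way, reusing the seed would mean the noise added to the input video in V2V is not independent of the noise the input was originally generated from, which is one plausible direction to investigate.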
Hi @tayton42, I added V2V support based on the initial commit of the LTX T2V pipeline. Please see below.
Thank you very much! Can you share the parameters you pass when calling the pipeline's call function, and how you obtain the input video's latents?
Hi,
Thanks for your great work.
I am trying to use LTX-Video in my research, which relies on a video-to-video (V2V) pipeline.
When I apply LTX V2V to a video generated by LTX itself, I get strange burning artifacts for strength values in the middle of the range (e.g. 0.4). These artifacts are reduced for small strength values (0.1) and very high strength values (0.9). Please see an example below:
The first video was generated by the LTX text-to-video (T2V) pipeline and is used as the input for the subsequent videos, which were generated with the V2V pipeline. The strength value is indicated in each filename. The burning artifacts are apparent in the trees and rocks, especially at strengths 0.4 and 0.7.
input.mp4
strength_0.1.mp4
strength_0.4.mp4
strength_0.7.mp4
strength_0.9.mp4
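For readers less familiar with the strength parameter, the sketch below shows how strength is commonly mapped to denoising steps in img2img-style pipelines (this is the convention used by the diffusers img2img pipelines). The V2V code used in this issue is not shown here, so treating it as following the same convention is an assumption, and `denoising_window` is a hypothetical helper name.

```python
def denoising_window(num_inference_steps, strength):
    """Return (start_index, steps_to_run) for a given strength.

    Assumed convention: the input is noised to the level of step
    `start_index` in the schedule, and only the remaining steps are run.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    steps_to_run = min(int(num_inference_steps * strength), num_inference_steps)
    start_index = max(num_inference_steps - steps_to_run, 0)
    return start_index, steps_to_run


# The step count of 50 is illustrative, not taken from the issue.
for strength in (0.1, 0.4, 0.7, 0.9):
    start, steps = denoising_window(50, strength)
    print(f"strength={strength}: skip {start} steps, run {steps} steps")
```

Under this convention, a low strength keeps the input mostly intact, a high strength replaces most of it with fresh noise, and mid-range values blend the input and the added noise in comparable proportions.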
The burning artifacts disappear when I use a different video that was not generated by LTX, or even a screen-recorded version of the video generated by LTX. I initially thought that some corruption occurs while saving the result produced by the LTX T2V pipeline; however, the same issue occurs when its output is passed directly to V2V.
Next, I hypothesized that some inherent noise may be present in the output of LTX T2V due to its VAE decoder. I thought of the following two possibilities:
(1) Some noise is present in LTX's output because the VAE decoder performs the last denoising step.
(2) Some noise is present in LTX's output because the VAE decoder has noise injection in its architecture.
However, these two possibilities were rejected since when I tried using the latents of the T2V output as input for the V2V pipeline (without decoding), these artifacts were still present.
At the moment, I can think of the following two possible causes of these burning artifacts:
(1) Something in the base model is adding some type of corruption to the output.
(2) The diffusion process results in some type of corruption in the output.
As a side note, I also tried the setups described above with CogVideoX, and no such burning artifacts are present in its results.
Do you have any thoughts on the problem described above and potential solutions for overcoming it?
Thanks in advance.